ABSTRACT

Next-generation radio arrays, including the SKA and its pathfinders, will open up new avenues for 
exciting transient science at radio wavelengths. Their innovative designs, comprising a large number 
of small elements, pose several challenges in digital processing and optimal observing strategies. The 
Giant Metre-wave Radio Telescope (GMRT) presents an excellent test-bed for developing and validating 
suitable observing modes and strategies for transient experiments with future arrays. Here we describe 
the first phase of the ongoing development of a transient detection system for GMRT that is planned to 
eventually function in a commensal mode with other observing programs. It capitalizes on the GMRT's 
interferometric and sub-array capabilities, and the versatility of a new software backend. We outline 
considerations in the plan and design of transient exploration programs with interferometric arrays, and 
describe a pilot survey that was undertaken to aid in the development of algorithms and associated 
analysis software. This survey was conducted at 325 and 610 MHz, and covered 360 deg^2 of the sky 
with short dwell times. It provides large volumes of real data that can be used to test the efficacies of 
various algorithms and observing strategies applicable for transient detection. We present examples that 
illustrate the methodologies of detecting short-duration transients, including the use of sub-arrays for 
higher resilience to spurious events of terrestrial origin, localisation of candidate events via imaging and 
the use of a phased array for improved signal detection and confirmation. In addition to demonstrating 
applications of interferometric arrays for fast transient exploration, our efforts mark important steps in 
the roadmap toward SKA-era science. 

Subject headings: methods: observational - instrumentation: interferometers - pulsars: individual 
(J1752-2806, Crab pulsar) - techniques: interferometric 



1. INTRODUCTION 

The transient Universe has remained a major astrophysical 
frontier over the past few decades. Transient phenomena 
are known on time scales ranging from sub-nanoseconds 
to years or longer, thus spanning almost 20 orders of 
magnitude in the time domain. Such emission is thought 
to indicate explosive or dynamic events and hence offers 
enormous potential to uncover a wide range of new 
astrophysics (e.g. Cordes et al. 2004b). 

While the transient sky at high energies (X- and γ-rays), 
and to some extent at optical wavelengths, is routinely 
monitored for transient and variable phenomena by a num- 
ber of wide field-of-view instruments, it remains a largely 
uncharted territory at radio wavelengths. Most previous 
high-sensitivity radio surveys (for pulsars and transients) 
have used large single dishes which, by definition, have rel- 
atively narrow fields-of-view. In addition, for the case of 
detection of short-duration transients ("fast transients", 
time scales of ~microseconds to ~seconds), there have 
been additional challenges such as the large signal pro- 
cessing overheads arising from the need to correct for ef- 
fects such as dispersion, and the ever-increasing number 
of radio frequency interference sources. These challenges 
have limited the scope of rigorous explorations of the radio 
transient sky. 

There is now a suite of new radio facilities in the design, 
construction or commissioning stages, many of which 
will offer wide field-of-view capabilities and thus open up 
new avenues of discovery. These are either multi-element 
radio arrays with moderate to large number of small-sized 
elements (dishes), or those comprising elements with na- 
tively wide field-of-view (i.e. aperture arrays). Examples 
include the newly operational Low Frequency Array (LO- 
FAR) and the Murchison Widefield Array (MWA), as well 
as upcoming SKA pathfinder instruments, viz. the Australian 
SKA Pathfinder (ASKAP) in Western Australia 
and MeerKAT in South Africa (Stappers et al. 2011; Tin- 
gay et al. 2012; Johnston et al. 2007; Booth et al. 2009). 
In principle, these instruments can provide large field of 
view (FoV) observations; however, they also present sig- 
nificant challenges in terms of the associated signal pro- 
cessing costs. Fortuitously, with the recent advances in 
affordable supercomputing and the use of graphics processing 
units in astronomical computing, this is fast becoming 
less of a challenge (e.g. Barsdell et al. 2010; Magro 
et al. 2011). Therefore, optimistically, the availability of 
such next-generation arrays, together with appropriate in- 
strumentation and suitable data archiving and processing 
strategies, can potentially revolutionize our knowledge of 
the transient radio sky in the coming decades. 

The scientific potential of radio transients has been well 
underscored in a number of recent reviews (e.g. Cordes et 
al. 2004b; Cordes 2009; Fender & Bell 2011; Bhat 2011). 
A wide variety of transient phenomena are known at radio 
wavelengths. While pulsar radio emission time scales 
range from milliseconds (sub-pulses) to nanoseconds (giant 
pulses), phenomena such as solar or stellar bursts, flares 
from Jupiter-like planets and brown dwarfs, micro-quasar 
emission, and gamma-ray burst (GRB) afterglows are of 
much longer durations (e.g. Chandra & Frail 2011). Some 
known radio transients have been discovered in follow- 
up observations of higher-energy detections; for exam- 
ple, gamma-ray burst afterglows and periodic pulsations 
from magnetars (Camilo et al. 2006; Levin et al. 2010). 
Other discoveries include transient sources in the direction 
of the Galactic Centre (GC) (Hyman et al. 2005; 
Bower et al. 2007; Roy et al. 2010) found through 
time-resolved VLA imaging of the GC, rotating radio transients 
(RRATs) found in transient searches of archival pulsar 
surveys (McLaughlin et al. 2006; Keane & McLaughlin 2011), 
and the possibly extragalactic millisecond bursts reported 
by Lorimer et al. (2007) and Keane et al. (2012). 

A distinction is often made between "slow" versus "fast" 
transients in the context of radio astronomy (cf. Cordes 
2009); slow transients can be detected through standard 
imaging of brief or long time integrations, while fast tran- 
sients require data collection with sufficiently high time 
and frequency resolution to correct for dispersive delays 
before detection is attempted. This paper is concerned 
with the detection of fast transients. These are often linked 
to coherent radiation processes and, frequently, to sources 
in extreme matter states (e.g. Cordes et al. 2004a). They 
are affected by plasma propagation effects such as disper- 
sion and, if the source is compact, by multi-path scattering 
and/or scintillation by the intervening media; hence, they 
may also serve as excellent probes of such media. 

As noted earlier, impulsive radio frequency interference 
(RFI) can potentially mimic signatures of real signals, and 
its frequent occurrence may impact an observation's 
sensitivity, thereby making weaker signals difficult to detect 
(e.g. Bhat et al. 2005). Interferometric instruments offer 
several unique advantages here. The distributed nature 
of array elements and long baselines can be exploited to 
identify and eliminate a wide range of RFI-generated 
transients. For example, voltage data can be correlated 
between elements to find fringes for the pulse, hence 
obtaining a sky position and localizing the detection. Most 
ongoing fast transient explorations, with the exception of the 
VLBA-based V-FASTR project (Wayth et al. 2011) and 
the LOFAR pulsar survey project (Coenen et al. 2012), use 
large single-dish instruments such as Parkes and Arecibo 
(Deneva et al. 2009; Burke-Spolaor et al. 2011), which 
offer none of those advantages, both because of their lower 
resilience to RFI and because the data are typically 
pre-processed prior to recording. 

Despite the clear advantages of interferometric transient 
searches, exploiting such arrays will require considerable 
planning and exploratory research. As neither of the 
conventionally employed observing strategies, incoherent 
(i.e. phase-insensitive) addition of antenna signals 
or a single phased-up array, is optimal for conducting 
large sky surveys, some new strategies will need to be 
developed and tested in order to fully exploit array 
instruments (e.g. Janssen et al. 2009; Stappers et al. 
2011; Coenen et al. 2012; Rubio-Herrera et al. 2013). 
Recently, Macquart (2011) and Colegate & Clarke (2011) 
approached the problem from the point of view of optimizing 
large-sky surveys within the context of next-generation array 
instruments including the SKA, and both advocate 
incoherent combination of antenna signals as the optimal 
strategy to achieve the highest detectable event rates. 
Existing arrays (e.g. GMRT, VLBA, LOFAR) can meanwhile 
demonstrate effective strategies that will be applicable 
when next-generation arrays are constructed. 

A number of salient features make the GMRT (Swarup 
et al. 1991) a powerful test-bed in this context. This 
low-frequency array of 30 × 45-m dishes, operating at 5 
different frequency bands in the range 0.15 to 1.5 GHz and with 
an effective collecting area A_eff ~3% of the SKA, offers several 
unique design features. Its moderate number of elements, 
relatively long baselines (up to ~25 km) and sub-array 
capabilities make it an excellent analog for SKA-like 
platforms. Furthermore, GMRT's new software backend (Roy 
et al. 2010) allows raw voltage data from individual 
array elements to be rerouted to software-based processing 
systems. 

Here we will describe ongoing efforts to equip the GMRT 
for transient exploration by (i) designing a software based 
system that will eventually function commensally with 
other observing programs, and (ii) undertaking pilot sur- 
veys that help demonstrate observational methodologies. 
This paper will focus on algorithms and methodologies, 
while the detailed implementation of a real-time processing 
pipeline and science results from pilot surveys are deferred 
to future papers. Apart from demonstrating the applica- 
tion of a "large-N, small-D" (LNSD) type instrument for 
transient explorations, these efforts will also enable new 
science with the GMRT. This is especially important given 
that the GMRT transient surveys will complement other 
similar efforts around the world in sky and frequency cov- 
erage. 

This paper is organised as follows. In §2, we outline con- 
siderations that drive transient exploration strategies with 
interferometric instruments. In §3 we highlight unique ad- 
vantages of the GMRT for this topic, and describe pilot 
surveys undertaken to aid the necessary technical develop- 
ment. Details of our transient detection pipeline are dis- 
cussed in §4, and applications to real data are presented in 
§5. In §6 we discuss our event analysis pipeline and present 
examples illustrating important methodologies. In §7 we 
comment on possible future directions and in §8 we present 
our conclusions. 

2. INTERFEROMETRIC ARRAYS FOR TRANSIENT SEARCHES: CONSIDERATIONS AND STRATEGIES 

In this section we discuss various considerations in 
searching for fast transient signals with interferometric in- 
struments. We discuss various technical and sensitivity 
considerations that arise from the distributed nature of 
array elements, the role of propagation effects in signal 
detection and analysis, the importance of searching over a 
large parameter space and the use of long baselines to serve 
as spatial filters against RFI. While much of our discus- 
sion is presented within the context of the GMRT, we em- 
phasise that these discussions are also applicable to other 
similar, particularly low-frequency, array instruments. 
Fig. 1. - Ratios of the probabilities of false alarms, P_C/P_A, where P_A is the probability for a sub-array where the signals from all the 
N antennas are added incoherently, and P_C is the joint probability for the case of p sub-arrays each having n = N/p antennas. Left panel: 
the ratios are plotted as a function of the number of antennas in a sub-array, for a detection threshold T/σ = 3 and for total numbers of 
antennas N = 3, 12, 21 and 30. Right panel: the same quantity P_C/P_A is plotted as a function of the threshold (in units of 
σ) for various sub-array combinations (n = 30, 15, 10, 6, 5, 3, 2, 1, from the top to the bottom curve), for the case of N = 30 (note that the 
curve for n = 30 corresponds to the top horizontal line at log P_C/P_A = 0); see the text for more details. 



2.1. Technical and Sensitivity considerations 

Array instruments can be used either in "incoherent array" 
(IA) or "phased array" (PA) modes for time-domain 
applications such as observing pulsars (Gupta et al. 2000), 
and in principle, similar strategies can be considered for 
the detection of fast transients. IA and PA correspond to 
modes which maximise the FoV and the detection sensitivity 
(or the effective collecting area A_eff), respectively. The IA 
mode is well suited to surveys; however, it comes at the expense 
of a significant reduction in overall sensitivity. At the other 
extreme is a fully coherent array mode, where the signals 
from individual elements have to be combined to produce 
(many) phased-array beams within the primary beam in 
order to achieve the full FoV of the single element. This 
can be prohibitively expensive in terms of the real-time 
signal processing costs, as the number of beams goes as 
(D/d)^2, where D is the physical extent of the array and d 
is the size of the individual element or dish. For instance, 
application to just the central square (1 km × 1 km) of the 
GMRT requires the formation of ~500 beams, whereas 
over ~10^5 beams will be required in order to realise the 
full FoV and sensitivity of the array. As a general rule, the 
use of phased-array beams for large surveys becomes less 
appealing as the filling factor of the array starts to fall off. 
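The (D/d)^2 scaling quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the GMRT software; the function name is hypothetical, and the 45-m dish size and the ~25 km maximum extent are the values quoted elsewhere in this paper.

```python
def n_phased_beams(array_extent_m: float, dish_diameter_m: float) -> float:
    """Approximate number of phased-array beams needed to tile one
    primary beam, using the (D/d)^2 scaling quoted in the text."""
    return (array_extent_m / dish_diameter_m) ** 2


# Central square: 1 km extent, 45-m dishes.
central_square = n_phased_beams(1_000.0, 45.0)
# Full array: baselines up to ~25 km, 45-m dishes.
full_array = n_phased_beams(25_000.0, 45.0)

print(f"central square: ~{central_square:.0f} beams")
print(f"full array:     ~{full_array:.1e} beams")
```

The central square comes out near 500 beams and the full array at a few times 10^5, which illustrates why fully coherent beamforming over the whole array is so costly for surveys.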

An intermediate strategy that tries to optimize the 
trade-off between sensitivity and FoV and to maximise 
A_eff × FoV, while offering additional advantages for 
transient searches, is to use distinct sub-arrays with 
appropriately combined signals. These sub-arrays could be 
incoherent or coherent formations, for which we may then use 
statistical measures on sub-array detections to optimise 
the performance with respect to sensitivity, FoV, radio 
frequency interference (RFI), excision of false positives, 
etc. Here we describe the basis for such a scheme that has 
been implemented and tested using the GMRT array. 

Our basic strategy is to generate a small number of in- 
coherently summed sub-arrays and combine the candidate 
transient event detections from the sub-arrays in a manner 
that optimises the rejection of false positives via suitable 
coincidence filtering techniques. This will preserve the full 
FoV of a single element. To motivate this strategy, we con- 
sider the probability of false alarms in a transient detec- 
tion scheme for various combinations of sub-arrays made 
from an array of N antennas. As described in detail in the 
Appendix, for an array of N elements configured to make 
p subarrays (with n = N/p elements per sub-array), the 
joint false alarm probability is given by 



P_C(> T) = [...] Erfc [...]        (1) 



where T = rσ is the detection threshold and Erfc is 
the complementary error function (i.e. r is the detection 
threshold in units of σ). As shown in the Appendix, P_C 
can be significantly less than P_A, the false alarm probability 
for a single sub-array (p = 1, n = N). This is illustrated 
in Fig. 1, which shows the ratio P_C/P_A as a function of n 
for different cases of N and a fixed value of r = 3.0 (left 
panel), and as a function of r, for different choices of n 
and a fixed value of N = 30 (right panel). 

As evident from these figures, for an array of 30 antennas 
like the GMRT, the false detection rate can be improved 
by a few orders of magnitude by splitting the array into 4-5 
sub-arrays of 6 to 7 antennas each. It is also clear that the 
improvements increase with a greater value of detection 
threshold. Since r (as defined in the Appendix) is relative 
to σ for the signal from a single antenna, realistic values 
(e.g. a 5σ threshold) for the different array combinations 
correspond to values of r < 5 (e.g. r ~ 1 corresponds to 
a ~5σ detection threshold for a single 30-antenna 
sub-array). For this range of r values, improvements in the 
false detection rate by a factor of 10-100 can be obtained 
by splitting the array into 4-5 sub-arrays, while using the 
same detection threshold (say 5σ). 
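The trend described above can be explored numerically. The sketch below is illustrative only and rests on stated assumptions rather than on the Appendix derivation: it assumes independent Gaussian noise in each sub-array, a two-sided tail probability erfc(x/√2), and a threshold r expressed in units of the single-antenna rms, so that an incoherent sum of n antennas sees a threshold of r√n in its own rms units. The function names are hypothetical, and the exact normalisation used in the Appendix may differ.

```python
import math


def false_alarm_single(r: float, n: int) -> float:
    """False-alarm probability for one incoherent sub-array of n antennas.

    ASSUMPTION: Gaussian noise, two-sided tail. r is the threshold in units
    of the single-antenna rms; averaging n antennas lowers the rms by
    sqrt(n), so the threshold is r*sqrt(n) in sub-array rms units.
    """
    return math.erfc(r * math.sqrt(n / 2.0))


def false_alarm_joint(r: float, n_total: int, n_per_subarray: int) -> float:
    """Joint probability that all p = N/n independent sub-arrays exceed
    the threshold at the same time (simple coincidence filter)."""
    p = n_total // n_per_subarray
    return false_alarm_single(r, n_per_subarray) ** p


N = 30
r = 5.0 / math.sqrt(N)            # a 5-sigma threshold for the full 30-antenna sum
p_a = false_alarm_single(r, N)    # single sub-array, p = 1, n = N

ratios = {n: false_alarm_joint(r, N, n) / p_a for n in (30, 15, 10, 6, 5, 3)}
for n, ratio in ratios.items():
    print(f"n = {n:2d}  p = {N // n:2d}  log10(P_C/P_A) = {math.log10(ratio):6.2f}")
```

Under these assumptions, five sub-arrays of six antennas give a ratio between 0.01 and 0.1, in line with the factor of 10-100 quoted in the text; a different tail convention would shift the numbers but not the trend.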

Note that for the sub-array case, operating at the same 
detection threshold corresponds to a lower absolute sensi- 
tivity than the full array case. However, it should be possi- 
ble to trade-off the false positive rate (while still keeping it 
below or comparable to that for the full array case) by re- 
ducing the threshold appropriately, thereby increasing the 
absolute sensitivity and bringing it closer to that of the full 
array. The sub-array case is expected to offer other advan- 
tages that accrue from rejection of false positives, such as 
discriminating against localised RFI; the effectiveness of 
this would depend on the physical extent of the full array, 
and how the antennas are grouped to form the sub-arrays. 

Fig. 2 shows P_C after normalisation to the probability 
of false alarms for a single incoherent array of all 30 
antennas. Such plots may serve as useful guides to design 
an optimal observing strategy, e.g. to determine the number 
of sub-arrays required to realize a desired false positive 
rate for a set threshold, or to determine the threshold value 
that will be needed to achieve the desired level of rejection 
for a chosen number of sub-arrays. 

Fig. 2. - The false alarm probability when sub-arrays with 
coincidence filters are used for the rejection of false positives. 
The resultant probability, P_C, is normalized to the probability 
for a single, 30-antenna incoherent array, P_A, and is shown as 
a function of the detection threshold r (in units of σ) and the 
number of antennas per sub-array n. Curves are drawn for 
log(P_C/P_A) = -0.1, -1, -2, -4, -6, -8, -10, -12 (top to bottom). 
These can be used as a guide to choose the number of sub-arrays 
for a set threshold and desired level of rejection of false positives. 
For example, a 2σ threshold (r = 2) and P_C/P_A = [...] require dividing 
the array into 10 sub-arrays. 

The absolute sensitivity considerations are as follows. 
The sensitivity of a single element is characterised by its 
gain (G_a) and the system temperature (T_sys). For a 
sub-array of N antennas, a signal is detectable if its peak flux 
density (S_pk) exceeds some minimum flux density as 
determined by the radiometer equation: 



S_{pk,min} = K \frac{\beta \, (T_{rec} + T_{sky})}{G_n \, (\Delta\nu \, N_{pol} \, W_p)^{1/2}}        (2) 



where T_rec and T_sky are the receiver and sky temperatures, 
respectively (T_sys = T_rec + T_sky for most instruments), 
G_n is the net gain of the array in K Jy^-1, Δν is 
the recording bandwidth, N_pol is the number of polarisations, 
W_p is the matched filter width employed in transient 
searching, β denotes the loss in signal-to-noise ratio (S/N) 
due to signal digitization, and the factor K is the detection 
threshold in units of rms flux density (σ). 

For a 30-antenna sub-array on the GMRT, operating 
with a bandwidth of 32 MHz for a 5σ threshold, the 
achievable sensitivity for a survey would typically range 
from ~1 Jy (at 610 MHz) for W_p = 1 ms to ~0.1 Jy 
for W_p = 100 ms. As the single-antenna gain (G_a) and 
T_sys (when pointed to the cold sky) are comparable at 
327 and 610 MHz, the nominal sensitivities are similar at 
both frequencies, though in practice the larger 327 MHz 
sky background (T_sky) will degrade the sensitivity. 

When using sub-arrays (see e.g. Fig. 3, where the array 
is divided into five sub-arrays of six antennas each), however, 
G_n scales as √n G_a. This leads to a worse sensitivity 
than that for a single N-element sub-array by a factor of 
√(N/n). As mentioned above, some or all of this loss can be 
recovered by lowering the threshold by a corresponding 
factor, provided the resultant false positive rate remains 
better than that achievable for the default threshold with 
the single sub-array case. 
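The radiometer equation (2) and the √n gain scaling can be evaluated together as a rough check. This is a sketch under stated assumptions: the single-antenna gain of 0.32 K/Jy and the system temperature of 100 K are illustrative values that are not given in the text, β is set to 1, two polarisations are assumed, and the function name is hypothetical.

```python
import math


def min_peak_flux_jy(k_sigma, t_sys_k, gain_single_k_per_jy, n_antennas,
                     bandwidth_hz, n_pol, width_s, beta=1.0):
    """Eq. (2) for an incoherent sum of n antennas, taking the net gain as
    G_n = sqrt(n) * G_a and T_rec + T_sky as a single system temperature."""
    g_net = math.sqrt(n_antennas) * gain_single_k_per_jy
    return k_sigma * beta * t_sys_k / (
        g_net * math.sqrt(bandwidth_hz * n_pol * width_s))


# ASSUMED illustrative values (not taken from the text).
G_A = 0.32      # K/Jy per antenna
T_SYS = 100.0   # K

s_1ms = min_peak_flux_jy(5.0, T_SYS, G_A, 30, 32e6, 2, 1e-3)
s_100ms = min_peak_flux_jy(5.0, T_SYS, G_A, 30, 32e6, 2, 100e-3)
s_sub = min_peak_flux_jy(5.0, T_SYS, G_A, 6, 32e6, 2, 1e-3)

print(f"30 antennas, W_p = 1 ms:   {s_1ms:.2f} Jy")
print(f"30 antennas, W_p = 100 ms: {s_100ms:.2f} Jy")
print(f"6 antennas,  W_p = 1 ms:   {s_sub:.2f} Jy (worse by sqrt(30/6))")
```

With these assumed inputs the 30-antenna case lands close to the ~1 Jy and ~0.1 Jy figures quoted above, and a six-antenna sub-array is less sensitive by √5 at the same threshold.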

2.2. Propagation effects 

The role of plasma propagation effects in fast tran- 
sient detection is discussed in detail by Cordes (2009) and 
Macquart (2011). These include dispersion, pulse broad- 
ening or scattering, and scintillation, and they are due 
to ionised interplanetary, interstellar and/or intergalactic 
media. While for most Galactic sources, the dominant 
contribution is from the interstellar medium (ISM), for 
sources at extragalactic or cosmological distances, there 
may also be significant contributions from the ISM of 
the host galaxy as well as from the intergalactic medium 
(IGM). 

The differential dispersion delay, Δt_DM (in ms), across 
an observing bandwidth Δν centred at an observing 
frequency ν (both in GHz) is given by Δt_DM ≈ 8.3 DM Δν ν^-3, 
where DM is the dispersion measure (in pc cm^-3). For Galactic 
sources, DM can be up to several thousand pc cm^-3 at large 
Galactic distances or toward the GC. Away from the Galactic 
plane, such large DMs can be expected for signals of 
extragalactic or cosmological origins. 
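As a worked check of the dispersion relation above, the sketch below evaluates the narrow-band approximation alongside the band-edge delay difference. The function names are hypothetical, and the constant 4.15 ms GHz^2 per pc cm^-3 is the standard cold-plasma dispersion constant, which is not quoted in the text.

```python
K_DM_MS = 4.15  # ms GHz^2 (pc cm^-3)^-1, standard dispersion constant


def dispersion_smearing_ms(dm: float, bandwidth_ghz: float, freq_ghz: float) -> float:
    """Narrow-band approximation: dt ~ 8.3 DM dnu nu^-3 (ms, GHz)."""
    return 8.3 * dm * bandwidth_ghz / freq_ghz ** 3


def dispersion_delay_ms(dm: float, f_lo_ghz: float, f_hi_ghz: float) -> float:
    """Delay between the lower and upper band edges."""
    return K_DM_MS * dm * (f_lo_ghz ** -2 - f_hi_ghz ** -2)


# DM = 10 pc cm^-3, 32 MHz bandwidth centred at 300 MHz.
approx = dispersion_smearing_ms(10.0, 0.032, 0.300)
exact = dispersion_delay_ms(10.0, 0.300 - 0.016, 0.300 + 0.016)

print(f"approximate smearing: {approx:.1f} ms")
print(f"band-edge delay:      {exact:.1f} ms")
```

Both estimates come out near 100 ms, so even a modest DM smears a millisecond pulse by two orders of magnitude at these frequencies unless the data are de-dispersed.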

Scattering (pulse broadening) leads to asymmetric pulse 
shapes with a stretched pulse tail. Measured pulse 
broadening times (τ_d) scale steeply with the observing 
frequency; τ_d ∝ ν^(-3.9±0.2) from observations (Bhat et al. 
2004). Detection will thus become difficult when τ_d >> 
W_int, the intrinsic width of emission. For τ_d ≲ W_int, signal 
detection can still be critically influenced by the degree of 
scattering. While pulse broadening conserves the fluence 
(i.e. integrated flux), the smearing in time leads to smaller 
pulse amplitudes (i.e. lower peak flux densities), and hence 
lower signal-to-noise in the detection. Scattering can thus 
play an important role in defining optimal search strategies 
with low frequency arrays such as the MWA and LOFAR 
as well as the GMRT. 
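The steep frequency scaling and the fluence argument above can be sketched as follows. The function names are hypothetical; the effective width is approximated here as the quadrature sum of the intrinsic width and τ_d, which is an assumption made for illustration and not a formula from the text. The 100 µs reference value at 600 MHz is the Crab figure quoted in the footnote.

```python
import math


def scale_broadening(tau_ref: float, freq_ref: float, freq: float,
                     index: float = -3.9) -> float:
    """Scale a pulse-broadening time to another frequency using
    tau_d proportional to nu^index (index = -3.9 from the text)."""
    return tau_ref * (freq / freq_ref) ** index


def peak_flux_factor(w_int: float, tau_d: float) -> float:
    """Fraction of the intrinsic peak flux that survives scattering when
    fluence is conserved. ASSUMPTION: effective width is the quadrature
    sum of the intrinsic width and the broadening time."""
    return w_int / math.hypot(w_int, tau_d)


# Crab example: ~100 us of broadening at 600 MHz, scaled to 300 MHz.
tau_300_us = scale_broadening(100.0, 600.0, 300.0)
print(f"tau_d at 300 MHz: ~{tau_300_us:.0f} us")

# Effect on a hypothetical 1 ms intrinsic pulse at 300 MHz.
factor = peak_flux_factor(1000.0, tau_300_us)
print(f"peak flux retained: {factor:.2f}")
```

Halving the frequency raises the broadening time by a factor of about 15, which is of the same order as the ~1 ms quoted for the Crab at 300 MHz, and a 1 ms pulse would then lose nearly half its peak amplitude.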

Both diffractive and refractive effects are important at 
low frequencies. Diffractive scintillation produces 
structure in both time and frequency, with characteristic 
scales of ~100 s in time and ~100 kHz in frequency for 
observations made at ~300-600 MHz and for DMs ≲ 50 
pc cm^-3 (e.g. Gupta et al. 1994; Bhat et al. 1998). As 
diffractive time scales are typically longer than ~seconds, 
an apparent brightening or dimming of signals may arise in 
cases where the diffractive bandwidth (ν_d) is of the order of, or 
larger than, the recording bandwidth (ν_d ≳ Δν). For 
distant sources, ν_d << Δν; thus signal detection will be 
minimally affected. Refractive scintillation, on the other hand, 
leads to slow flux modulation on time scales of ~days to 
weeks or longer (e.g. Gupta et al. 1993; Bhat et al. 1999). 
Regardless of their impact on signal detectability, 
propagation effects can potentially serve as a useful 
discriminator from local RFI. While it may be possible for certain 
types of RFI to mimic one or more propagation effects, 
it is unlikely that non-astrophysical signals will emulate 
multiple effects in a manner consistent with models of 
astrophysical media. 

2.3. Parameter space and search volume 

In searching for short-duration radio transients, the two 
most basic search parameters are: (i) DM, and (ii) the 
duration of the signal. The latter is typically quantified 
as the effective pulse width, W_p. Here we briefly discuss 
the search parameter space, particularly in terms of 
limitations imposed by dispersion, scattering, detection 
sensitivity, and search volume. 

Fig. 3. - A map of the GMRT array indicating antenna locations 
across the east, west and south arms and in the central 1 km × 1 km 
region. The offsets Δx and Δy are relative to the antenna C02, 
which is at (Δx, Δy) = (0, 0). The antennas can be grouped into 
multiple distinct sub-arrays of nearly equal sensitivities. The case 
for five sub-arrays is shown here, with three sub-arrays formed from 
antennas across the three arms, and the other two from antennas 
located in the central square. Such arrangements can yield high 
efficiencies in the identification and elimination of spurious events 
due to RFI. 

Fig. 4. - Plots of the maximum distance to which transient 
detections are possible (D_max vs peak luminosity L_pk), for the GMRT's 
low frequency bands. D_max tends to vary linearly at low 
luminosities (i.e. smaller distances) when scatter broadening is negligible, 
and more slowly with increasing luminosity at large distances when 
scattering becomes prominent. While the nominal sensitivities are 
comparable for the GMRT at 325 and 610 MHz, scattering is 
important even at relatively small distances at 325 MHz. The effect 
is more pronounced at lower frequencies, resulting in significantly 
lower values of D_max for a given L_pk and hence smaller search volumes for 
detectable signals. 

As discussed above, dispersion delays can be substantial, 
even at moderate DMs, for low radio frequencies; e.g. a 
pulse with W_int = 1 ms and DM = 10 pc cm^-3 will be 
smeared over ~100 ms in observations with Δν = 32 MHz 
centered at 300 MHz. While it is generally advisable to 
search out to very large DMs, in practice for searches 
within or near the Galactic plane, the maximum DM that 
can be effectively searched will likely be limited by pulse 
broadening. As the number of trial DMs is typically 
determined from analytical constraints that ensure minimal 
degradation of S/N due to DM errors, the DM spacings 
tend to be fairly small at low frequencies, thereby 
requiring a large number of trial DMs to span a given DM range. 
This can translate to significant processing costs for low 
frequency searches. 

The vast spread in the duration of known transient 
phenomena makes a compelling case to search in time duration 
over as wide a range as possible. In practice, the shortest 
time scale that can be effectively searched is limited to the 
sampling interval achievable with the recording instrument 
(dt); any signals of W_int ≲ dt will thus be instrumentally 
broadened to dt. However, at low frequency, pulse 
broadening of astrophysical origin will likely exceed instrumental 
broadening, even at moderate DMs.^1 At large DMs, however, 
pulse broadening will limit the achievable (effective) 
time resolution, and it will be difficult to detect heavily 
scattered pulses owing to S/N degradation from broadening. 
The longest time durations that can be searched will 
therefore be dictated by pulse broadening. 

Fig. 5. - A diagrammatic representation of the GMRT transient detection pipeline. Raw voltage data from each array element are captured 
and made available to a processing pipeline. Multiple (incoherent) sub-array data streams are generated, and a transient search is performed 
on each data stream. The resulting events are assimilated through the event identification and coincidence filter algorithms to select the final 
candidates. For real-time implementation (dashed lines and arrows), this information is then used to generate triggers that will alert the raw 
data capture system to record relevant raw data segments for further detailed processing and scrutiny. 

Propagation effects may also significantly influence 
the maximum distance to which a detection is possible, 
D_max, and therefore the search volume, given by 
V_max = (1/3) Ω_s D_max^3, where Ω_s is the FoV. As D_max 
scales as S_pk,min^(-1/2), the prominent low-frequency effect is 
pulse broadening. The resultant amplitude degradation 
(S_pk decreases as the pulse is broadened, since the fluence 
is conserved) leads to a lower D_max and consequently a 
smaller search volume. A detailed treatment of this effect 
and relevant survey metrics are given by Cordes (2009), 
who considers different possible survey strategies, both 
for fast and slow transient searches with the SKA. 
Following the formalism presented there, useful plots can be 
made of D_max vs L_pk (where L_pk = S_pk D^2 is the peak 
luminosity) for a given choice of search parameters. As an 
illustration, Fig. 4 shows such sensitivity plots for different 
GMRT frequencies, for one specific line of sight within our 
pilot survey region (l = 50°, b = 3°). Here we account for 
various propagation effects, instrumental broadening, and 
the increase in sensitivity from using matched filtering. 
Reduced sensitivities (in the lower L_pk range) at 150 and 
235 MHz are due to the relatively larger sky backgrounds 
at these frequencies. As evident from these plots, D_max 
is reduced at higher L_pk (i.e. larger distances for a given 
S_pk), resulting in departures from linear trends compared 
to the lower L_pk range. This effect is obviously direction 
dependent, thus making detection rates a strong 
function of sky position and frequency (e.g., Macquart 2011). 
^1 For instance, W_int ~1 µs emission from the Crab is broadened to 
~100 µs at 600 MHz and ~1 ms at 300 MHz (Bhat et al. 2007). 



Such considerations may be used to optimize strategies for 
maximal survey yields. 
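The dependence of the search volume on the detection threshold can be illustrated with a short calculation. This sketch assumes the pseudo-luminosity convention L_pk = S_pk D^2, so that D_max = (L_pk / S_pk,min)^(1/2), and uses hypothetical input values throughout (luminosity in Jy kpc^2, flux density in Jy, FoV in steradians); the function names are not from any existing package.

```python
import math


def d_max_kpc(l_pk_jy_kpc2: float, s_min_jy: float) -> float:
    """Maximum detectable distance, assuming L_pk = S_pk * D^2."""
    return math.sqrt(l_pk_jy_kpc2 / s_min_jy)


def search_volume_kpc3(fov_sr: float, d_kpc: float) -> float:
    """Search volume of a cone of solid angle fov_sr out to d_kpc."""
    return fov_sr * d_kpc ** 3 / 3.0


FOV_SR = 4.0e-4   # hypothetical field of view
L_PK = 100.0      # hypothetical peak luminosity, Jy kpc^2
S_MIN = 1.0       # Jy, unscattered detection threshold

d_clean = d_max_kpc(L_PK, S_MIN)
# Scattering that lowers the peak amplitude by 4x acts like a 4x higher threshold.
d_scattered = d_max_kpc(L_PK, 4.0 * S_MIN)

v_ratio = search_volume_kpc3(FOV_SR, d_scattered) / search_volume_kpc3(FOV_SR, d_clean)
print(f"D_max: {d_clean:.1f} kpc -> {d_scattered:.1f} kpc")
print(f"search volume ratio: {v_ratio:.3f}")
```

Because the volume scales as S_pk,min^(-3/2), a factor of four loss in peak amplitude halves D_max and shrinks the search volume by a factor of eight, which is why scattering weighs so heavily on low-frequency survey yields.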

2.4. Radio frequency interference 

Impulsive and narrow-band RFI can be a major 
impediment in the detection of fast transients, increasing the 
number of false positives and raising the system noise. The 
increase in false positives is particularly problematic 
for real-time detection schemes. With the ever-increasing 
number of RFI sources (especially those that can mimic 
astrophysical signals) and the advent of wide-bandwidth 
observing systems, it is becoming imperative to develop 
mitigation strategies for a wide variety of RFI sources and 
signals. 

Significant resilience to RFI can be developed through 
the use of appropriate instrumentation and online identi- 
fication and excision schemes. Systems that use multi-bit 
recording can have a significant dynamic-range advantage 
over the traditional one- or two-bit recorders used in most 
systems. Prominent among prospective online mitigation 
schemes are those which employ median absolute devia- 
tion or spatial filtering (e.g. Roy et al. 2010; Kocz et al. 
2010) and spectral kurtosis filtering methods (Nita & Gary 
2010). Effectiveness of a given strategy will depend on the 
instrument as well as the RFI environment. 

Even with online schemes, however, a large number of 
false positives may pass through the processing pipeline, 
requiring post-detection mitigation schemes so that the 
number of candidate events that require human scrutiny 
can be reduced to a manageable level. This is especially 
critical for systems that need to function in a commensal 
mode. The long baselines of array instruments provide 
excellent capabilities here, enabling coincidence checks to 
allow identification and elimination of a large fraction of 
spurious events that are not common to all array elements. 
As outlined in §2.1, this forms the key strategy for our 
transient detection scheme for the GMRT. Coincidence 
filtering provides a simple but powerful strategy. 

Fig. 6. - Example plots from the GMRT transient detection pipeline. A transient pulse was detected in the survey field GTC_001.01-1.43 
(observations at 325 MHz, i.e. beam width ~80'). The pulsar PSR J1752-2806 is located at an offset of 42' from the phase center. The array 
was divided into four groups of roughly equal numbers of antennas; however, the detection significance for sub-arrays 2 and 3 (i.e. the south arm and 
the central square) was relatively lower, possibly due to the failure of some antennas to function at their nominally expected sensitivities. 
The top sub-panels show the de-dispersed time series (left) and frequency-time excerpts around the detected event (right). The bottom left 
sub-panels show the signal at two nearby DMs as well as at zero DM in addition to the candidate DM, while the bottom right panel is an 
enlarged version of the frequency-time plane excerpt around the signal. The detection of the signal in all four sub-arrays, its broad-band 
nature, dispersion sweep, a reduced S/N at nearby DMs and the absence of signal at DM = 0 serve as multiple positive detection diagnostics. 

Instruments with interferometric capabilities offer yet
another powerful means of discriminating against RFI-
generated transient events. The signatures of real signals
are likely to be distinctly different in the image plane in
comparison to those due to RFI. As such, by their very 
nature, short-duration transients may be originating from 
sources that are necessarily compact and hence will likely 
be seen as point sources in the image plane, provided an 
image can be made at sufficiently high time resolution 
(e.g., Law & Bower 2012). On the other hand, RFI bursts 
may yield various kinds of artifacts in the image plane, 
and are less likely to mimic the characteristics of point 
sources. Therefore by incorporating snap-shot imaging of 
candidate events among the event analysis strategies, fur- 
ther discrimination can be achieved against RFI sources. 

3. THE GMRT AS A TEST BED INSTRUMENT 

The GMRT has a number of inherent design features 
which can be exploited for developing and demonstrating 
useful observing strategies for time domain science appli- 
cations with next-generation instruments. In addition to 
those previously noted, the combination of moderate-sized 
paraboloids and operation at low frequencies mean relatively
large fields-of-view, e.g. ~6 deg$^2$ at 150 MHz and ~1.5
deg$^2$ at 325 MHz. These, along with the capabilities of
its new software backend (Roy et al. 2010), in particular 



its ability to capture raw voltage data from all 30 array 
elements and make them available to software-based processing
systems and pipelines, are promising for a variety
of exploratory development. 

3.1. The GMRT software backend 

The recently developed GMRT software backend (GSB), 
built mainly from commercial off-the-shelf (COTS) components,
is a fully real-time, 32-antenna, 32 MHz, dual-polarization
backend. The basic requirements for the GSB
are to support two main modes of operation: (i) a real-time
correlator and beamformer for an array of 32 dual
polarized signals with a maximum bandwidth of 32 MHz, 
(ii) a base-band recorder where raw voltage signals from 
all the antennas can be recorded to disks, accompanied 
by off-line correlation and beamforming. Further details 
on design and implementation are described in Roy et al. 
(2010). 

3.2. Transient exploration with the GMRT 

Fig. 7. Another example detection from the GMRT transient pipeline, where the noise fluctuations lead to a transient pulse being detected
at slightly different DMs and pulse widths, in addition to different peak S/Ns, across different sub-arrays. The true DM of the signal (i.e. a
pulse from PSR J1752-2806 at ~71' offset from the phase centre) is closer to the DM value reported by sub-array 2, in which the pulse
was detected at a slightly reduced S/N (by ~10%) at DM = 51.18 pc cm$^{-3}$ (i.e. the best DM as reported by the other three sub-arrays). The
widths and heights of the rectangular (red) boxes are proportional to the effective widths and peak S/Ns as found by the processing pipelines.

For transient exploration, our eventual goal is to de-
velop and implement a system that will generate and pro-
cess multiple incoherent array data streams in real-time
for detecting transient candidate signals, and trigger the
data recording system to extract and store relevant raw
data segments for detailed offline investigations. Given the
complexity of the problem, we adopt a two-phase strategy:
the first phase involves conducting some pilot surveys and
the development of a processing pipeline that operates on 
recorded raw voltage data. The outcomes from these are 
then used to finalise the design considerations for a real- 
time transient detection system. Among the most power- 
ful features of such an approach are: 

• exploitation of long baselines for powerful discrim- 
ination between signals of RFI origin and those 
of celestial origin via effective coincidence filtering 
and cross-checks between multiple independent data 
streams. 

• event localisation possible via high-resolution imaging
(5-10 arcsec) and/or full beam synthesis across the
FoV, both for important integrity checks as well as 
for facilitating high-frequency and multi-wavelength 
follow-ups with other instruments. 

• the ability to form sensitive phased-array beams to- 
ward targets of interest and record baseband data 
so as to enable high time resolution studies of signal 
characteristics including coherent dedispersion and 
polarimetry. 

The modest recording bandwidth (maximum 32 MHz) 
of the current GMRT makes this a feasible exercise in 
terms of the related data rates and processing require- 
ments. Even though the GMRT's FoV is relatively small 
in comparison to those of SKA pathfinder instruments 
such as ASKAP or MeerKAT, the gain of a single antenna 
of the GMRT is almost 10 times larger than that of a single
($d \sim 12$ m) element of these next-generation arrays. Thus,
with an aggregate effective collecting area of ~3% of the SKA,
the GMRT makes a highly sensitive instrument for con- 
ducting useful science demonstrations. 



3.3. A pilot transient survey with the GMRT 

In order to aid the related technical development and 
demonstrate the scientific credibility of transient explo- 
ration strategies, we conducted a pilot survey for short- 
duration transients with the GMRT, covering a small area 
of the sky (-10° < l < 50° and |b| < 3°) with fairly short
dwell times (5 minutes per pointing). The data were col- 
lected in a specially designed observing mode where raw 
voltage streams from all 30 antennas were recorded on to 
the disks. This survey was conducted at 325 and 610 MHz, 
where the GMRT offers its highest sensitivity, due to the 
large gains ($G \sim 10$ K Jy$^{-1}$ for the full array) and relatively
low system temperatures ($T_{\rm sys} \sim 100$ K) at these frequencies.
The region within 1° of the plane was surveyed at 610
MHz and the areas above and below this at 325 MHz. This 
choice was based on two main considerations: (1) to allevi- 
ate severe scattering at low frequencies, in particular very 
close to the plane and toward the Galactic centre; (2) to 
optimise the survey speed: 18 deg$^2$ per hr at 325 MHz vs
5 deg$^2$ per hr at 610 MHz, so that the survey is completed
within a modest amount of telescope time. The specific 
sky region was chosen because of its significant overlap 
with that of the Parkes Multibeam survey (thereby al- 
lowing immediate high frequency checks of any promising 
candidates), and also because it is the sky region where 
the density of known pulsars and rotating transient objects
is the largest. The relatively short dwell times mean
that the survey is primarily sensitive to sources with fairly
high event rates (10 hr$^{-1}$ or more), such as giant-pulse
emitters and rotating radio transients.

Fig. 8. Example plots from the GMRT transient detection pipeline: the array was configured into seven different sub-arrays, each
comprising a single antenna, thus providing a powerful coincidence filter against spurious events of RFI origin. Data were collected by
emulating a 'survey mode', by scanning the sky region around the Crab pulsar at 0.5 deg min$^{-1}$. A bright giant pulse was detected as a
'transient' when the pulsar was within the telescope beam (half power beam width ~0.5°). The pulse is very narrow (~50 μs), approximately
twice the sampling resolution (30 μs), and so is seen as a thin strip on the waterfall plot. The signal peaks at DM = 56.74 pc cm$^{-3}$, with a
sharp decline in S/N even at small departures from the true DM; for instance, $\Delta$DM/DM ~ 0.02 results in a S/N loss of almost a factor of
four, exemplifying the need for very short DM spacings and high time and frequency resolutions for transient searches at low radio frequencies.

Fig. 9. Event rate vs. detection threshold for different sub-array
combinations possible with the GMRT array, for a representative
data set to characterize false positives due to signal statistics. For
each combination, a pair of curves is shown, where the top and
bottom (dashed and solid) ones correspond to the event rates at
the pre- and post-coincidence filtering stages. The pre-coincidence
(dashed) curves denote the aggregate event rates from multiple different
sub-arrays (i.e. the sum total of the event rates) for a given
combination. Detection sensitivity is a strong function of the number
of antennas included in a given sub-array, while the coincidence
power increases with the number of sub-arrays.

In addition to the above pilot surveys, we also conducted 



observations in a number of exploratory modes. These in- 
clude modes in which the array was sub-divided into multi- 
ple different groups (i.e. sub-arrays), with all configured to 
make pointed observations of a single selected target (such 
as the Crab pulsar), as well as modes in which different 
sub-arrays were configured to point to different targets of 
choice (i.e. a variant of the Fly's Eye observing mode).
These observations were made at a frequency of 610 MHz. 

Given our primary technical objective of developing a 
transient detection system and the required methodolo- 
gies, it was imperative to record this survey data in the 
"raw dump" mode of the GMRT software backend. This 
exploratory mode allows recording raw voltages from all 
30 elements of the array, in two polarizations, with either 
2- or 4-bit digitization. The aggregate data rate was approximately
1 GB s$^{-1}$ or 3.6 TB hr$^{-1}$ (from 30 × 2 signal
paths). For the survey parameters outlined above, this 
amounts to 42 hr of on-sky time, translating into a to- 
tal data volume of 151 TB. These data were transported 
to the Swinburne supercomputing facility where all pro- 
cessing and analyses were carried out. Transient searches 
spanned up to 1000 pc cm$^{-3}$ in DM (in 1000 DM steps)
and a maximum time scale of ~500 ms. More details on
the processing and results will be reported in a future pub- 
lication. 

4. TRANSIENT DETECTION PIPELINE 

In this section we outline a transient detection pipeline 
that we developed for offline analysis of the data from the 
pilot survey with the GMRT. We delve into various steps 
involved as we proceed from raw voltage data to the detec- 
tion and final scrutiny of candidate events. The data from 
our pilot survey runs were used as test beds for develop- 
ing the related software. In this paper we focus primarily 
on methodology and algorithms, with the implementation 
details to be reported in a separate paper. 

Fig. 10. Plots similar to Fig. 9, for a field encompassing a known pulsar. The pulsar (PSR J1752-2806) is located near the edge of the
~1.5 deg$^2$ FoV (i.e. 70' offset from the phase center). Left: processing at the pulsar DM and over the full recorder bandwidth ($\Delta\nu$ = 16.66
MHz); right: processing over a narrow bandwidth ($\Delta\nu$/8) to emulate weaker pulses. The detection sensitivity in Jy (top label) is based on
nominal gain and system temperature for the GMRT at 325 MHz.

The basic idea involves generating multiple incoherent
sub-array beams and using coincidence filter schemes for
the rejection of false positives and RFI. The array layout 
of the GMRT and a possible scheme for sub-dividing into 
multiple distinct groups (sub-arrays) is shown in Fig. 3. 
The block diagram of the processing pipeline is depicted 
in Fig. 5. Even though our beamformer software has been 
heavily optimized for a specific architecture (a constraint 
that arises from the GSB design considerations; see Roy 
et al. (2010)), the general scheme may also be applicable 
to other array instruments such as ASKAP or MeerKAT. 

4.1. Raw data capture from array elements 

As we demonstrate in later sections, access to raw volt- 
age data from individual array elements offers a great deal 
of flexibility in terms of planning and conducting efficient 
transient searches with multi-element interferometric in- 
struments. The raw voltage data can be easily interfaced 
to an incoherent beam former that offers the choice of the 
number of sub-arrays as well as the number of elements 
per sub-array. The sensitivity and other requirements of 
transient searching outlined in § 2.1 can thus be met with 
minimal constraints. Moreover, such flexibility can also be 
exploited to adapt to the changes in the RFI environment 
across the array. 

The baseband recorder mode of the GSB can be config- 
ured for either 16 or 32 MHz bandwidth, with 4 or 2 bit 
digitisation respectively, so that the aggregate data rate 
is limited to 1 GB s$^{-1}$ or 3.6 TB hr$^{-1}$ (again a constraint
imposed by the GSB design considerations). The record- 
ing cluster used in the current system comprises 16 nodes, 
each with 4 TB of data storage, thereby providing a total
data storage capacity of 64 TB, i.e. enough for
up to 18 hr of continuous baseband recording.
There are four 1 TB disks connected to each node, and 
data from each antenna are streamed into separate disks. 



Each recorded data buffer is accompanied by a timestamp
derived from the NTP server.
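
As a rough consistency check of the recording figures quoted above, the short sketch below recomputes the aggregate data rate and the recording time that fits in the cluster storage. It assumes that each signal path is real-sampled at the Nyquist rate (two samples per second per Hz of bandwidth) and ignores headers and other overheads; the function names are illustrative only.

```python
# Back-of-the-envelope check of the baseband recording figures.
# Assumption: real-valued Nyquist sampling (2 x bandwidth samples per second)
# for each of the 30 x 2 signal paths; no packet or header overhead.

def aggregate_rate(bandwidth_hz, bits_per_sample, n_ant=30, n_pol=2):
    """Aggregate raw-voltage data rate in bytes per second."""
    samples_per_s = 2.0 * bandwidth_hz
    return n_ant * n_pol * samples_per_s * bits_per_sample / 8.0


def recording_hours(capacity_tb, rate_tb_per_hr):
    """Hours of continuous recording that fit in the given capacity."""
    return capacity_tb / rate_tb_per_hr


rate_16mhz_4bit = aggregate_rate(16e6, 4)   # 16 MHz band, 4-bit samples
rate_32mhz_2bit = aggregate_rate(32e6, 2)   # 32 MHz band, 2-bit samples
hours = recording_hours(16 * 4.0, 3.6)      # 16 nodes x 4 TB at the quoted 3.6 TB/hr

print(f"{rate_16mhz_4bit / 1e9:.2f} GB/s, {rate_32mhz_2bit / 1e9:.2f} GB/s, {hours:.1f} hr")
```

Under these assumptions both modes give 0.96 GB/s, consistent with the quoted ~1 GB/s, and 64 TB at 3.6 TB/hr lasts just under 18 hr.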

Online RFI detection and excision is an important con- 
sideration for transient detection with the GMRT. Of the 
prospective schemes described in § 2.4, filtering that relies 
on median absolute deviation is the only technique that 
has been tested on the GMRT data (e.g. Roy et al. 2010). 
It is our aim to further explore the efficacies of this as well 
as other methods in the detection of short-duration tran- 
sients, and converge on a possible implementation scheme 
for the real-time version of our pipeline. We have incorpo- 
rated some rudimentary data quality checks in our current 
processing pipeline. These include basic sanity checks of 
each and every data stream for any instrumental failures 
or malfunctioning and then using this information to suit- 
ably reconfigure relevant sub-arrays as well as the related 
coincidence parameters. 

4.2. Formation of multiple incoherent beams 

The rationale for dividing the array into multiple sub- 
arrays and opting for an incoherent addition of the in- 
tensities from telescopes in each sub-array is already de- 
tailed in § 2.1 (see also Fig. 3). The simplest implementa- 
tion of this procedure involves combining the signals from 
different elements of the array after the required delay 
and broad-band fringe phase corrections and spectral de- 
composition. This is currently realised through a software 
system that operates on $2 \times N_{\rm ant}$ raw data streams from
the data acquisition cluster and $2 \times N_{\rm sub}$ sets of antenna
"masks" (where $N_{\rm sub}$ is the number of sub-arrays), and
performs the relevant signal additions in parallel. These
multi-channel filterbank data streams, with a time resolution
$\delta t = 2 N_{\rm chan} / f_{\rm samp}$, where $f_{\rm samp}$ is the Nyquist sampling
frequency and $N_{\rm chan}$ the number of spectral channels,
are then converted to intensities and summed after application
of the suitable antenna masks. These incoherently
added intensity data are integrated (if needed) to achieve
the desired time resolution. For example, for processing
the data from pilot surveys, we have configured this incoherent
beamformer to output data streams with 256 spectral
channels across the 16 MHz bandwidth ($\Delta f \sim 62.5$
kHz) at a time resolution of 30.72 μs.

Fig. 11. Plots similar to Fig. 9, where the chosen fields correspond to "blank skies" (i.e. fields that contain no detectable known pulsars
or rotating radio transients) and during conditions when the RFI-generated false positives were numerous. The left panel shows the results
for the survey field GTC_352.17+1.27 when RFI was comparatively severe (in terms of the RFI-generated events), and the right panel shows
the results for another field GTC_002.25+1.17 when RFI was relatively modest.

Higher time resolution can only be achieved at the cost 
of a reduced spectral resolution. Searches at low frequen- 
cies inherently benefit from high resolutions in both time 
and frequency, and therefore this involves trade-offs in 
terms of the maximum DM that can be searched and the 
achievable time resolution. For example, a higher spectral 
resolution (i.e. 512-channel filter bank sampled at 61.44
μs) at 325 MHz will allow searching out to larger DMs
while still not limiting the detectability of intrinsically nar- 
row signals such as giant pulses, as they will be scatter 
broadened to ~1 ms. 
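
The incoherent beamforming of § 4.2 can be illustrated with a minimal NumPy sketch. It is not the GSB beamformer: delay and fringe-phase corrections are omitted, the voltages are assumed to be real-sampled, and the function names, array layout and the integration factor are assumptions made for illustration. Each block of $2 N_{\rm chan}$ samples is Fourier transformed, detected, summed over polarizations and over the antennas selected by each sub-array mask, and then integrated in time.

```python
import numpy as np


def time_resolution(n_chan, f_samp, n_integ=1):
    """Output sampling interval: 2 * N_chan / f_samp per spectrum, times n_integ."""
    return 2.0 * n_chan / f_samp * n_integ


def incoherent_beams(voltages, masks, n_chan=256, n_integ=2):
    """Form one incoherent (intensity-summed) beam per sub-array.

    voltages : real array, shape (n_ant, n_pol, n_samp)
    masks    : boolean array, shape (n_sub, n_ant); True selects an antenna
    returns  : array, shape (n_sub, n_time, n_chan)
    """
    n_ant, n_pol, n_samp = voltages.shape
    block = 2 * n_chan                      # real samples per spectrum
    n_spec = n_samp // block
    v = voltages[:, :, :n_spec * block].reshape(n_ant, n_pol, n_spec, block)
    # Spectral decomposition; drop the Nyquist bin to keep n_chan channels.
    spectra = np.fft.rfft(v, axis=-1)[..., :n_chan]
    # Detect (voltage -> intensity) and add the two polarizations.
    power = (spectra.real ** 2 + spectra.imag ** 2).sum(axis=1)
    # Sum the intensities of the antennas belonging to each sub-array.
    beams = np.einsum("sa,atc->stc", masks.astype(power.dtype), power)
    # Integrate n_integ consecutive spectra to reach the desired resolution.
    n_out = n_spec // n_integ
    beams = beams[:, :n_out * n_integ].reshape(len(masks), n_out, n_integ, n_chan)
    return beams.sum(axis=2)
```

With an assumed sampling rate of 33.33 MHz (twice a 16.66 MHz band), `time_resolution(256, 100e6 / 3, n_integ=2)` reproduces the 30.72 μs quoted above, and 512 channels with the same integration give 61.44 μs.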

4.3. Searching for transients 
4.3.1. Dedispersion and Detection 

Most traditional search algorithms for detecting fast 
transients operate on fast-sampled, multi-channel (filter- 
bank) data and hence involve incoherent dedispersion fol- 
lowed by searching for transient events in the resultant 
time series. Dedispersion is performed over a large number
of trial DMs (e.g. up to ~1000 pc cm$^{-3}$) using the
standard direct dedispersion algorithm. This is the most 
computationally intensive part of our processing pipeline. 
As dispersion delays can be substantial at the GMRT's 
frequencies (cf. § 2.2; $\Delta t_{\rm DM} \propto \nu^{-3}$ for small $\Delta\nu$), a large
number of trial DMs are required to span such a large DM 
range, even when observations are made over moderate 
bandwidths of 16-32 MHz. 
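
For illustration, direct (brute-force) incoherent dedispersion at a single trial DM can be sketched as follows. This is a simplified stand-in, not the survey software referred to in the text: it uses the standard cold-plasma dispersion delay, rounds the per-channel delays to whole samples, and all function names are hypothetical.

```python
import numpy as np

K_DM = 4.148808e3  # dispersion constant in s MHz^2 pc^-1 cm^3


def dispersion_delay(dm, f_mhz, f_ref_mhz):
    """Delay (s) of frequency f_mhz relative to f_ref_mhz for a given DM."""
    return K_DM * dm * (1.0 / f_mhz ** 2 - 1.0 / f_ref_mhz ** 2)


def dedisperse(filterbank, freqs_mhz, dt, dm):
    """Shift each channel by its dispersion delay and sum over frequency.

    filterbank : array, shape (n_time, n_chan)
    freqs_mhz  : channel centre frequencies in MHz, shape (n_chan,)
    dt         : sampling interval in seconds
    """
    f_ref = freqs_mhz.max()  # delays are measured relative to the top channel
    shifts = np.rint(dispersion_delay(dm, freqs_mhz, f_ref) / dt).astype(int)
    n_out = filterbank.shape[0] - shifts.max()
    series = np.zeros(n_out)
    for chan, shift in enumerate(shifts):
        series += filterbank[shift:shift + n_out, chan]
    return series
```

A search repeats `dedisperse` for every trial DM (and every sub-array beam), which is why this stage dominates the computational cost.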

Our current processing pipeline makes use of the dedis- 



persion software that was developed for the ongoing high 
time resolution survey at Parkes (Keith et al. 2010; Burke- 
Spolaor et al. 2011). This software takes advantage of 
modern multi-core processors that allow multi-threaded 
software to achieve significant speed-ups in computation. 
While the Parkes survey decimates the data to two bits 
per sample, our processing pipeline is designed to operate
on 16-bit data samples, which provide a much higher
dynamic range, and also ensures immunity against possi- 
ble signal saturation from powerful RFI bursts. Following 
the convention in pulsar searches, spacings between DM 
values are determined by an analytic constraint on the 
signal-smearing due to incorrect trial DM. A GPU imple- 
mentation of this dedispersion software has recently been 
developed (Barsdell et al. 2010) and is being integrated 
into the real-time version of our processing pipeline which 
will be described in a forthcoming paper. 

Our approach of employing multiple sub-arrays for transient
detection with the GMRT means that the dedispersion
stage will result in $N_{\rm sub} \times N_{\rm DM}$ time series to search,
where $N_{\rm DM}$ is the total number of DM trials. As we
demonstrate in later sections, four sub-arrays are optimal 
for transient detection with the GMRT. For our observ- 
ing frequencies and recording bandwidth, ensuring a sig- 
nal degradation (from dispersive smearing due to incorrect 
DM) of no more than 1% will necessitate 10$^3$ time series
for each sub-array. 

Fig. 12. An illustrative example to highlight the power of multiple sub-arrays and coincidence filtering for the detection of real astronomical
transient events whose peak amplitudes may be well below the typically used 5-6σ thresholds. Data from a field encompassing a known pulsar
(but located at a large offset from the phase center) are processed over a small fraction of the recording bandwidth ($\Delta\nu$/8) to mimic such
a scenario. The top panel shows the raw time series (at the original time resolution of 30.72 μs), followed by the detections from a single
incoherent sum of all 30 antennas for two set thresholds of 6σ and 3.5σ (i.e. second panel from the top); the four panels that follow (from
third to sixth) show the detections (at a much lower threshold of 1.5σ) from individual sub-arrays when the array is sub-divided into four groups. The
bottom panel is the output from the coincidence filter. The large ticks in the second and bottom panels are the reference markers for the
expected locations of pulses. Of the 11 pulses within this short data block (6 s), all but the faintest pulse have been successfully detected.

Fig. 13. Another illustration to highlight the advantages of multiple sub-arrays and coincidence for transient detection. The data block is
the same as that shown in Fig. 12 but processed for different possible sub-array groupings. The top panel shows the raw data time series,
followed by the detections from a full incoherent 30-antenna array (i.e. second panel from the top), while the bottom four panels correspond
to cases where the number of sub-arrays ranges from 2 to 5. The large ticks in the bottom five panels are the reference markers that indicate
the expected locations of pulses. The dotted lines correspond to the detection thresholds for each of the different sub-array combinations.
The detection efficiency progressively improves from 2 to 4 sub-arrays compared to a single incoherent sum, whereas 5 sub-arrays appears
to be a less than optimal choice. A more quantitative analysis based on the processing of the full data length (from this pointing) is summarized in
Table 1.

Detection of transient events essentially involves the
identification of data samples, or groups of samples, that
are above a set threshold (e.g. 5σ) in the dedispersed time
series. Matched filtering, as approximated through a range
of box car widths, is the commonly employed detection
technique (e.g. Cordes & McLaughlin 2003). This simple,
yet effective, methodology has been extensively used in a
number of ongoing searches based at Parkes and Arecibo
as well as other instruments around the world (e.g. Deneva
et al. 2009; Burke-Spolaor et al. 2011; Bhat et al. 2011). 
Alternate techniques, such as those based on quadratic discriminants
and other statistics, have also been explored
and demonstrated to a certain extent (e.g. Thompson 
et al. 2011; Fridman 2010; Spitler et al. 2012), though 
their efficacies as viable alternatives in large-scale tran- 
sient searches have not yet been thoroughly tested. The 
present version of our pipeline therefore employs matched 
filtering as the primary detection algorithm. 

Matched filtering is approximated by progressive smooth- 
ing of time series data over a range of box car filters of 
widths $2^n$ samples, followed by application of threshold
tests, each time recording the event amplitude, the time 
of occurrence and duration (e.g. Deneva et al. 2009; Burke- 
Spolaor et al. 2011). While relatively simple and easy to 
implement, this technique of matched filtering has some 
shortcomings; for instance, powerful RFI bursts that occur
over relatively long durations (i.e. $W_p \gg \delta t$) will be



detected as an overwhelmingly large number of events. 
Furthermore, as the pulse templates have discrete widths 
of $2^n$ samples by design of the algorithm, this means reduced
sensitivity to events whose widths are intermediate
to those of the chosen box car filters. As discussed by 
Deneva et al. (2009) and Bhat et al. (2011), alternate 
methods, such as those based on time domain clustering 
along the lines of the friends-of-friends logic may help
alleviate some of these demerits of matched filtering. 
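
The box car approximation to matched filtering described above can be sketched as below. This is a minimal illustration rather than the pipeline code: the time series is assumed to be zero-mean with a known noise rms `sigma`, and the function name and return format are hypothetical.

```python
import numpy as np


def boxcar_search(series, sigma, threshold=5.0, max_log2_width=6):
    """Threshold a time series after smoothing with box cars of width 2**n samples.

    Returns a list of (start_sample, width_in_samples, snr) tuples, one for
    every smoothed sample above the threshold.
    """
    csum = np.concatenate(([0.0], np.cumsum(series)))
    events = []
    for n in range(max_log2_width + 1):
        width = 2 ** n
        if width > len(series):
            break
        # Running sum over `width` samples, normalised so that pure noise
        # has unit variance at every box car width.
        snr = (csum[width:] - csum[:-width]) / (sigma * np.sqrt(width))
        for start in np.flatnonzero(snr > threshold):
            events.append((int(start), width, float(snr[start])))
    return events
```

Even a single noiseless pulse is reported many times, at several widths and neighbouring start samples, which is what motivates the event association step discussed in § 4.3.2.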

4.3.2. Initial scrutiny of detected events 

Fig. 14. Diagrammatic representation of the event analysis
pipeline for a detailed scrutiny of the candidate events that emerge
from the GMRT transient detection pipeline. Depending on the
characteristics of the candidate event, one or more analyses or scrutinies
may be possible, e.g. processing of the raw data segments to
generate the visibilities (for making a snapshot image of the FoV);
forming a sensitive phased-array beam toward the target (to enable
a high quality detection), followed by a suitable dispersion removal
process and the signal detection. Final event scrutiny is then performed
to arrive at a list of candidates for further follow-ups.

An important consequence of the above described search
strategy, which involves many trial DMs and box car
widths, is that each single event will be detected as multiple
events in the DM-time parameter space depending on
the signal strength, duration and DM. In order to identify
and associate multiple related events arising from a certain
transient pulse, we employ algorithms along the lines
of friends-of-friends logic that are very similar to those
described in Burke-Spolaor et al. (2011). This essentially
involves performing an association of events in time, DM 
and the matched filter width $W_p$, effectively identifying
the groups of related events in the parameter space. In 
practice, this may be realized in two steps; first, for a 
given DM, association of related events is performed in 
time; specifically starting with the widest pulse and look- 
ing for pulses which overlap and, as each new pulse is 
associated, the search window is extended to include the 
net time range. The criterion here is that the peak of the 
second pulse overlaps with the full width of the first. A 
similar association is subsequently performed in DM, by 
effectively checking for contiguous events in DM and asso- 
ciating any events with the S/N characteristics that may 
follow likely astronomical signals (e.g. peaking at true 
DM and lower S/N with an increase in departure from the
true DM). The procedure also accounts for possible time 
delays or advances expected due to the DM offset, thus 
ensuring that multiple real events of different DMs are de- 
tected as separate events, whereas multiple events due to 
a given RFI burst (often spanning the full DM range) are 
still detected as a single event. In the end, multiple points 
in the DM-time-$W_p$ parameter space which are related are
counted as a single event. 
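
A much simplified version of such an association is sketched below. It is an assumption-laden illustration rather than the procedure used in the pipeline: two detections are treated as "friends" if their time ranges overlap and their trial-DM indices are adjacent, groups are built with a union-find structure, and the S/N-versus-DM behaviour and DM-dependent time offsets mentioned above are ignored. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    t: float      # start time (s)
    w: float      # matched-filter width (s)
    dm_idx: int   # index of the trial DM
    snr: float    # peak S/N


def are_friends(a, b, max_dm_gap=1):
    """True if the time ranges overlap and the trial DMs are adjacent."""
    overlap = min(a.t + a.w, b.t + b.w) - max(a.t, b.t)
    return overlap >= 0 and abs(a.dm_idx - b.dm_idx) <= max_dm_gap


def friends_of_friends(detections, max_dm_gap=1):
    """Group related detections; return the highest-S/N member of each group."""
    parent = list(range(len(detections)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Pairwise comparison is O(N^2); adequate for a sketch only.
    for i in range(len(detections)):
        for j in range(i + 1, len(detections)):
            if are_friends(detections[i], detections[j], max_dm_gap):
                parent[find(i)] = find(j)

    groups = {}
    for i, det in enumerate(detections):
        groups.setdefault(find(i), []).append(det)
    return [max(group, key=lambda d: d.snr) for group in groups.values()]
```

Detections that are not direct friends (e.g. two trial DMs apart) still end up in the same group when a chain of friends connects them.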

4.4. Coincidence filtering and elimination of false 
positives 

The main goal of coincidence filtering is the removal of 
false positives due to noise and RFI, thereby improving the 
efficiency to discriminate genuine signals of astronomical 
origin. In ideal conditions, when the signals are sufficiently 
strong to allow clear detections, this can be achieved with 
strict simultaneity checks in terms of the characteristics 
of the detected events. An example detection of this kind 
is shown in Fig. 6. However, real-world considerations 
necessitate a more flexible approach, especially when the 
detected signals are relatively weak (i.e. near the detection
thresholds) and the sensitivity of the sub-arrays is
not guaranteed to be identical. An example of this kind
of effect can be seen in Fig. 12, which shows the detections
from 4 sub-arrays of the GMRT, for a few successive pulses 
of a relatively weak pulsar where the detections are barely 
above the acceptable S/N threshold. As can be seen, sub- 
array 4 has a somewhat lower sensitivity than the other 3 
sub-arrays, and even otherwise, the detectability of indi- 
vidual pulses does vary across the sub-arrays, presumably 
due to noise fluctuations. In such cases, it is possible that 
a given transient pulse may be detected at slightly differ- 
ent DMs, pulse widths (durations) or times of occurrence 
by different sub-arrays (an example for which is shown in 
Fig. 7), and an efficient recovery mechanism needs to take 
this into account. 
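
To give a feel for why coincidence allows much lower per-stream thresholds, the sketch below evaluates noise-only false-alarm probabilities under idealised assumptions: Gaussian noise that is independent between sub-arrays, strict same-sample coincidence, equal-sized sub-arrays, and incoherent-sum sensitivity scaling as the square root of the number of antennas. RFI violates these assumptions, so the numbers are indicative only, and the function names are hypothetical.

```python
import math


def gaussian_tail(threshold_sigma):
    """One-sided probability that unit-variance Gaussian noise exceeds the threshold."""
    return 0.5 * math.erfc(threshold_sigma / math.sqrt(2.0))


def coincident_false_alarm(threshold_sigma, n_sub):
    """Probability that independent noise exceeds the threshold in all n_sub streams."""
    return gaussian_tail(threshold_sigma) ** n_sub


def subarray_snr(full_array_snr, n_sub):
    """S/N in each of n_sub equal sub-arrays for an incoherent sum (sqrt(N_ant) scaling)."""
    return full_array_snr / math.sqrt(n_sub)


single = gaussian_tail(3.5)                 # one 30-antenna sum at 3.5 sigma
quadruple = coincident_false_alarm(1.5, 4)  # four sub-arrays, each at 1.5 sigma
per_subarray = subarray_snr(3.5, 4)         # a 3.5 sigma pulse seen by each sub-array

print(f"{single:.2e} {quadruple:.2e} {per_subarray:.2f}")
```

In this idealised picture, a four-fold coincidence at 1.5σ has a lower per-sample false-alarm probability than a single 3.5σ threshold, while a pulse at 3.5σ in the full incoherent sum would still lie above 1.5σ in each sub-array.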

In order to account for such effects and their potential 
impact on the detectability of genuine astrophysical sig- 
nals, we have devised a somewhat flexible coincidence logic 
for the GMRT transient pipeline. Basically, each event is 
characterized by its basic properties such as the arrival 
time (t), duration (w), DM and the peak S/N. The events 
from different sub-arrays are cross-checked for coincidence 
criteria defined in terms of these parameters. Two events 
$E_1(t_1, w_1, {\rm DM}_1, {\rm S/N}_1)$ and $E_2(t_2, w_2, {\rm DM}_2, {\rm S/N}_2)$ from
sub-arrays 1 and 2 are treated as coincident provided
(i) the overlap in their times of occurrence is within a set
range, $T_{\rm tol}$; (ii) the difference in DMs is within a set range,
${\rm DM}_{\rm tol}$; and (iii) the difference in the peak S/Ns is within
a set range, $({\rm S/N})_{\rm tol}$. That is, their characteristics need to
be such that (i) $\delta{\rm DM} < {\rm DM}_{\rm tol}$, (ii) $\delta({\rm S/N}) < ({\rm S/N})_{\rm tol}$,
(iii) $\Delta T / w_1 > T_{\rm tol}$ and (iv) $\Delta T / w_2 > T_{\rm tol}$, where $\delta{\rm DM}$
and $\delta({\rm S/N})$ are the fractional differences in the DMs and
S/Ns, and $\Delta T$ is the overlap in the time ranges, as determined
by the respective time ranges, $(t_1, t_1 + w_1)$ and
$(t_2, t_2 + w_2)$, for the events $E_1$ and $E_2$. For coincidence
between three or more sub-arrays, each event is checked
against events from every other sub-array, beginning with
the highest S/N event. As we illustrate in later sections,
such a scheme is particularly important for the detection 
of weaker signals. Specifically, it increases the prospects of 
detecting real (weaker) signals, while limiting the number 
of false positives. 
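
The coincidence criteria above can be written compactly as in the following sketch. It is illustrative only: the fractional differences are taken relative to the larger of the two values (an assumed convention, since the text does not specify one), the default tolerances are placeholders, event widths are assumed to be positive, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    t: float     # arrival time (s)
    w: float     # duration (s), assumed > 0
    dm: float    # dispersion measure (pc cm^-3)
    snr: float   # peak S/N


def fractional_difference(a, b):
    """|a - b| relative to the larger value (assumed convention)."""
    scale = max(abs(a), abs(b))
    return 0.0 if scale == 0 else abs(a - b) / scale


def is_coincident(e1, e2, t_tol=0.5, dm_tol=0.1, snr_tol=0.5):
    """Apply criteria (i)-(iv): DM, S/N and fractional time overlap."""
    overlap = min(e1.t + e1.w, e2.t + e2.w) - max(e1.t, e2.t)
    return (fractional_difference(e1.dm, e2.dm) < dm_tol
            and fractional_difference(e1.snr, e2.snr) < snr_tol
            and overlap / e1.w > t_tol
            and overlap / e2.w > t_tol)


def coincidence_filter(events_by_subarray, **tolerances):
    """Keep events of the first sub-array that have a coincident partner in
    every other sub-array, starting with the highest-S/N event."""
    reference, others = events_by_subarray[0], events_by_subarray[1:]
    passed = []
    for event in sorted(reference, key=lambda e: e.snr, reverse=True):
        if all(any(is_coincident(event, other, **tolerances) for other in sub)
               for sub in others):
            passed.append(event)
    return passed
```

Raising `t_tol` or lowering `dm_tol` and `snr_tol` makes the filter more stringent, at the risk of rejecting weak genuine pulses whose measured parameters differ slightly between sub-arrays.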

The parameters $T_{\rm tol}$, ${\rm DM}_{\rm tol}$ and $({\rm S/N})_{\rm tol}$ thus determine
the "stringency" of the coincidence logic. For example,
insisting on higher time overlaps (e.g. $T_{\rm tol} \sim 50\%$) between
the detected events implies a stringent coincidence
logic, whereas allowing for smaller time overlaps (e.g. $T_{\rm tol}
\lesssim 20\%$) would result in a coincidence that is relatively more
lenient. The parameters ${\rm DM}_{\rm tol}$ and $({\rm S/N})_{\rm tol}$ essentially help
ensure that only those events of similar characteristics (in 
DM and brightness) have chances of passing through the 
coincidence filter. The choice of the above parameters 
also has implications in terms of the rates of false posi- 
tives, since a more lenient coincidence logic means rela- 
tively higher rates of false positives. Similarly, a higher 
stringency in the coincidence logic may potentially result 
in filtering out real astronomical signals that are near the 
detection thresholds. Events that pass the set coincidence 
criteria are subjected to a further detailed scrutiny while 
those that fail are rejected from the analysis. While there 
may be various factors that influence optimal values of 
these parameters, RFI can also be expected to play a sig- 
nificant role, in particular for the GMRT given its location 
in a relatively RFI-prone environment. 

On the basis of a preliminary analysis of our survey data 
(i.e. 130 fields covering ~200 deg$^2$ of the sky) at 325
MHz, it appears that representative values may be ~5-10%
for ${\rm DM}_{\rm tol}$, and ~50% for $({\rm S/N})_{\rm tol}$ and $T_{\rm tol}$. Specifically,
we note that these are derived particularly from the
data on two specific survey fields (GTC_002.52-1.64 and 
GTC_001.01-1.43) that contained a known pulsar, but at 
relatively large offsets of 71' and 42' respectively from the 
beam phase center (i.e. near the edge and well outside the
half power beam). These were processed for various pos- 
sible combinations of the parameters, and the tolerance 
settings that resulted in maximal number of real pulse 
detections (and minimal number of false positives) were 
treated as optimal choices. These were subsequently ver- 
ified using data from the pointings at the beginning of 
the survey observations (when the pulsar would be at the 
phase centre). 
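
The role of the three tolerances can be illustrated with a minimal sketch. The event fields, the definition of fractional overlap, and all function names below are illustrative assumptions rather than the pipeline's actual implementation; the defaults simply echo the representative values quoted above, reading T_tol as a minimum fractional time overlap and DM_tol and (S/N)_tol as maximum fractional differences.

```python
from dataclasses import dataclass


@dataclass
class Event:
    t_start: float  # start time of the event (s)
    width: float    # event duration (s), assumed > 0
    dm: float       # dispersion measure (pc cm^-3)
    snr: float      # signal-to-noise ratio


def time_overlap(a: Event, b: Event) -> float:
    """Overlap of two events as a fraction of the shorter one (assumed definition)."""
    overlap = min(a.t_start + a.width, b.t_start + b.width) - max(a.t_start, b.t_start)
    return max(overlap, 0.0) / min(a.width, b.width)


def frac_diff(x: float, y: float) -> float:
    """Fractional difference relative to the larger of the two values."""
    largest = max(abs(x), abs(y))
    return 0.0 if largest == 0 else abs(x - y) / largest


def coincident(a: Event, b: Event, t_tol=0.5, dm_tol=0.1, snr_tol=0.5) -> bool:
    """True if two events agree in time, DM and S/N within the tolerances."""
    return (time_overlap(a, b) >= t_tol
            and frac_diff(a.dm, b.dm) <= dm_tol
            and frac_diff(a.snr, b.snr) <= snr_tol)


def passes_filter(event: Event, other_subarrays, **tol) -> bool:
    """True if every other sub-array holds at least one event coincident with `event`."""
    return all(any(coincident(event, e, **tol) for e in events)
               for events in other_subarrays)
```

Raising `t_tol` or lowering `dm_tol` and `snr_tol` makes the filter more stringent, at the risk of rejecting real signals near the detection threshold.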

4.5. Examples of transient detections 

Figure 6 shows an example candidate event detected 
in our survey observations (GTC_001.01-1.43) that con- 
tained a known pulsar at an offset of 1.2 deg from the 
phase center. This is from observations made at 325 MHz 
(i.e. a FoV ≈1.5 deg^2 or a half power beam width ~84').
These basic diagnostic plots illustrate a number of signal
characteristics expected of astrophysical signals. For
instance, the dedispersed time series and frequency-time
plots (top panels) provide immediate assessments of
coincidence of signal detection in multiple different sub-arrays.
Other important signatures include a dispersion sweep in 
the time-frequency plane and the change in signal strength 
versus DM, which is shown as the dedispersed time se- 
ries at the candidate DM as well as at two nearby DMs 
along with that at DM=0 (bottom panels) . Our processing 
pipeline also records additional information such as signal 
strength vs. DM (for optimum Wp ) and signal strength vs 
Wp (for optimal DM). A basic scrutiny along these lines 
can be employed in order to arrive at a list of candidates 
that may require further detailed investigations. 

Fig. 8 shows another example from our transient detec- 
tion pipeline. These observations were made in a special 
mode where seven of the 30 antennas were pointed to the 
Crab pulsar, thus emulating 7 sub-arrays, each comprising 
a single antenna. This provides a powerful coincidence fil- 
ter against spurious events of RFI origin. The data were 
collected in a 'survey mode', by making the telescopes 
scan the sky region around the Crab pulsar at a rate of 
0.5 deg min^-1. A bright giant pulse was thus detected
as a 'transient' when the pulsar was within the telescope 
beam (the half power beam width at 610 MHz is ~0.5°). 
This example highlights the need to employ very short DM 
spacings as well as high time and frequency resolutions in
order to retain sensitivity out to durations as short as tens
of μs, which is possible with our transient pipeline.
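
The need for fine DM spacing at short time scales follows from the standard cold-plasma dispersion delay, Δt ≈ 4.15 ms × DM × (ν_lo^-2 − ν_hi^-2) with frequencies in GHz. The sketch below evaluates this relation; the band edges and the 30 μs sampling in the usage lines are hypothetical values chosen for illustration, not parameters taken from the pipeline.

```python
K_DM_MS = 4.148808  # dispersion constant in ms GHz^2 pc^-1 cm^3


def dispersion_delay_ms(dm, f_lo_mhz, f_hi_mhz):
    """Delay (ms) of the lower band edge relative to the upper one for a given DM."""
    f_lo = f_lo_mhz / 1e3
    f_hi = f_hi_mhz / 1e3
    return K_DM_MS * dm * (f_lo ** -2 - f_hi ** -2)


def dm_step_for_resolution(t_res_ms, f_lo_mhz, f_hi_mhz):
    """DM offset at which the residual sweep across the band equals one time sample."""
    return t_res_ms / dispersion_delay_ms(1.0, f_lo_mhz, f_hi_mhz)


# Hypothetical 16 MHz band near 610 MHz, sampled at 30 microseconds
sweep_per_unit_dm = dispersion_delay_ms(1.0, 602.0, 618.0)
dm_step = dm_step_for_resolution(0.030, 602.0, 618.0)
print(f"sweep per unit DM: {sweep_per_unit_dm:.3f} ms; DM step: {dm_step:.3f} pc cm^-3")
```

Because the sweep per unit DM grows steeply toward lower frequencies, the DM step needed to preserve a given time resolution shrinks accordingly.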

5. APPLICATIONS TO REAL DATA 

In § 2.1 we discussed the advantages of using multiple 
sub-arrays and coincidence for transient detection and its 
impact on the detection sensitivity to transient signals. 
In § 4.4 we delved into the details of practical implemen- 
tation of our coincidence detection logic. In this section 
we present some examples to illustrate the effectiveness of 
such a scheme through its applications to real data ob- 
tained from our pilot surveys. Specifically, we highlight 
(i) the reduction of false positives for the cases of (a) 
pure noise, and (b) RFI contamination (§5.1), and (ii) the 
power of coincidence filtering in facilitating the detection 
of weaker astronomical signals (§5.2). 

5.1. Reduction of false positives 

The basic underlying concept here is that the vast majority
of events generated due to noise fluctuations and RFI
signals will be uncorrelated between different sub-arrays. 
In order to investigate this, we compare the pre- and post- 
coincidence filtering event rates as a function of number of 
sub-arrays and detection threshold. We consider various 
possible combinations for the GMRT array, ranging from 2 
to 5 sub-arrays, and process the data over a wide range of 
detection thresholds, down to ~0.5σ at the single antenna
level. This analysis becomes fairly cumbersome given the 
complexity of the number of sub-array combinations and 
running the pipeline at the very low threshold values that 
we use. We therefore focus on some select case studies as 
described below. 

5.1.1. Case study 1: A blank field in the absence of RFI 

This is probably the simplest case, the results from
which can be directly compared against the predictions
based on the theoretical analysis presented in §2.1. To
emulate an "absence of RFI", we performed the related
analysis on a data set that is virtually devoid of any
noticeable RFI. To further reduce the effect of interference
signals and keep matters simple, the data were processed
at a single, large DM value (200 pc cm^-3). Furthermore,
the coincidence logic described earlier in § 4.4 was heavily
simplified in order to match the assumptions made in the
theoretical analysis (for instance, a uniform detection
sensitivity across all different sub-arrays). Specifically, the
tolerances in terms of DMs, arrival times and S/N ratios
were set to zero, which implies the maximum possible
stringency achievable with the coincidence logic.

Fig. 15. - Images of a single pulse from the pulsar J1752-2806. The left panel shows the "dirty" image; the sidelobe response of the
interferometer can be clearly seen. The right panel shows the image after cleaning. The contour levels are logarithmically spaced between 20
and 270 mJy. The synthesized beam is shown in the top left corner and is 59 x 10 in size. The flagging and calibration were done using the
FLAGCAL pipeline, and the image was made using the AIPS task IMAGR.

The results from the analysis are shown in Fig. 9, where 
we plot both the pre- and post- coincidence filtering event 
rates for detection thresholds down to ~0.5σ. The sub-array
groupings of antennas have been chosen such that
maximal resilience against localized RFI is achievable; e.g. 
in the case of four sub-arrays, 3 of them are formed from 
7-8 antennas that are located along the east, west and 
south arms, whereas the fourth one is comprised of an- 
tennas from the central 1 km x 1 km area. The results 
for different sub-arrays have been scaled to equivalent
single antenna thresholds by applying the theoretically
expected scaling.^2 The top panel denotes the thresholds in
units of Jy, assuming nominal sensitivity parameters of 
the GMRT (cf. Eqn. 2). The 10 to 100 times improve- 
ment seen in the post-coincidence event rates compared to 
a single 30-antenna sub-array (i. e. the full GMRT array) 

2 

Under the assumption of identical gains and system temperatures for
individual antennas, the detection threshold for a sub-array of n_ant
antennas is σ_sub = σ/√n_ant, where σ denotes the detection threshold
for a single antenna.



is in rough agreement with the theoretical predictions (cf. 
Fig. 1, where the region of interest is the first 5 curves in
the top left hand corner of the figure). There are however
some discrepancies; for instance, the results for the 3
sub-array case are only marginally better compared to those
for 2 sub-arrays. Moreover, little improvement is seen in
going from 4 to 5 sub-arrays. These discrepancies may be 
due to some faint RFI signals that are common to different 
sub-arrays, or perhaps because of detection thresholds not 
scaling as theoretically expected in practice. Besides this, 
the results are in accordance with expectations from the 
theoretical analysis, thereby ratifying our basic principle 
for splitting the array into incoherent sub-arrays. 
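
The expected gain from coincidence can be sketched under idealised assumptions: independent Gaussian noise in each sub-array, equal-sized sub-arrays, and thresholds that scale as the square root of the number of antennas. This is an illustration only, not a reproduction of the analysis in § 2.1, and the function names are hypothetical.

```python
from math import erfc, sqrt


def tail_prob(threshold_sigma):
    """One-sided Gaussian probability of a noise sample exceeding the threshold."""
    return 0.5 * erfc(threshold_sigma / sqrt(2.0))


def sub_array_threshold(full_array_threshold, n_sub):
    """Equivalent threshold when the incoherent array is split into n_sub equal groups."""
    return full_array_threshold / sqrt(n_sub)


def coincident_false_prob(full_array_threshold, n_sub):
    """Probability that noise alone exceeds the scaled threshold in all n_sub groups at once."""
    return tail_prob(sub_array_threshold(full_array_threshold, n_sub)) ** n_sub


for n_sub in (2, 3, 4, 5):
    ratio = tail_prob(3.5) / coincident_false_prob(3.5, n_sub)
    print(f"{n_sub} sub-arrays: noise false positives reduced by a factor of ~{ratio:.0f}")
```

Real data depart from this picture wherever RFI is correlated between sub-arrays or the sensitivity does not scale as assumed.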

5.1.2. Case study 2: A blank field in the presence of RFI 

The GMRT's proximity to densely populated regions
and operation at low frequencies pose major challenges in 
terms of corruption due to RFI. The main sources of RFI 
include power lines, transmitters, TV boosters and cell 
phone towers. Some of these are highlighted in Paciga
et al. (2011), which presents a detailed characterization of
RFI sources seen in the GMRT's ~150 MHz band. A 
somewhat similar situation prevails at the other low fre- 
quency bands of the GMRT, even though it is true that 
most of the wideband, impulsive RFI sources have spec- 
tra that become weaker at higher frequencies. The pre- 
liminary processing of our pilot survey data suggests that 
at least some modest fraction of our survey data at 325 
MHz is corrupted by RFI. The presence of multiple bright 
RFI sources, and short to moderate baselines of the array 
(~100 m to ~25 km), lead to some interesting challenges
in terms of gaining immunity against resultant false posi- 
tives. 

Different survey scans with varying degrees of RFI were
identified and analyzed in order to investigate the
improvement in terms of the pre- and post-coincidence event
rates. One specific example is shown in Fig. 11. The com- 
binations of 4 or 5 sub-arrays retain the discriminatory 
power in terms of immunity from RFI false positives, re- 
sulting in a significant improvement in terms of the rates 
of false positives compared to the reference case of a full 
incoherent array. However, the results for 2 and 3 sub- 
arrays are somewhat puzzling, particularly with regard to 
the improvement seen in going from 2 to 3 sub-arrays. 
In fact the 3 sub-array combination seems to be virtually 
ineffective when RFI gets severe (see left panel). This 
may suggest correlated RFI events between 3 sub-arrays; 
e.g. powerful RFI sources located near the central square 
region, the antennas from which are roughly evenly split 
between the 3 sub-arrays. While we have attempted to 
further investigate this puzzling observation by trialing 
different ways to form the sub-arrays (e.g. based on the 
proximity to the central electronics) and also by process- 
ing several different data sets, the results have not been 
quite conclusive. We therefore speculate this is likely to 
be an intrinsic feature of the GMRT array. 

5.2. Detection of real astronomical signals 

As outlined in § 2.1, an improved efficiency in terms 
of the rates of false positives can only be achieved at the 
cost of reduced sensitivities at individual sub-array lev- 
els. While our approach to divide the array into multiple 
groups of incoherent sums seems like a reasonable trade-off,
the net result is reduced detection sensitivity, particularly
to weaker signals. For instance, a
6σ transient pulse from a single 30-antenna array will be
detected as a 3σ event when the array is sub-divided into
four groups. An equally important aspect therefore is the
efficiency that may be achievable in the detection of such
weak (but real) astronomical signals. Lowering the detection
thresholds to ~2-3σ, in principle, should result in the
detection of such signals, but this can be achieved only 
at the expense of a much larger number of false positives 
(mostly from signal statistics and perhaps some from RFI 
signals). As emphasised earlier, an underlying assumption 
is that the vast majority of these may be uncorrelated and 
therefore will be excised by coincidence filtering. In order 
to illustrate this, we conducted some specific analysis on 
suitably selected survey scans, the results from which are 
summarised below. 

5.2.1. Case study: A field encompassing a known pulsar 

The detection of genuine astrophysical signals is illustrated
through an example survey field that contains a
known pulsar (PSR J1752-2806) at an offset of 1.2° from
the phase center (i.e. ≈1.7× the half power beam width
at 325 MHz). The strength of the signal is such that, at
the incoherent array output, the brightest pulses from the
pulsar mimic intermittent transient signals, thus providing
a very good test case. We processed these data at the
pulsar's DM (50.372 pc cm^-3) and over the full recording
bandwidth (Δν = 16.66 MHz) as well as over a much reduced
(Δν/8 ≈ 2 MHz) bandwidth; the latter was done in
order to emulate even weaker pulses. The resultant plots of 
pre- and post- coincidence filtering event rates are shown 
in Fig. 10. 



A quick inspection of these figures helps draw some use- 
ful conclusions. For example, in the full bandwidth case 
(left panel), all post-coincidence detection curves tend to 
merge near and above ~1σ, thus approaching the expected
pulse rate of ≈1.8 s^-1. This suggests that all
genuine pulses that are bright enough (i.e. above the set
detection thresholds) are detected, thus providing a crucial
integrity check of our processing pipeline. Secondly,
for the reduced bandwidth case, where we emulate weaker 
pulses (i.e. S/N ≈ 3 times lower), while the event rates
for 2 or 3 sub-arrays at lower thresholds (≲ 1σ) are still
dominated by false positives, a substantial improvement is 
seen on going to a larger number of sub-arrays. Overall 
this makes quite a compelling case to go for at least 4 sub- 
arrays. Furthermore, the improvement is only marginal on 
going from 4 to 5 sub-arrays, which suggests that 4 sub- 
arrays may be an optimal choice for transient detection 
with the GMRT. 

5.2.2. Detection of weaker signals 

Fig. 12 provides a useful illustration for the case of four
sub-arrays. We have taken a short stretch of data from the
above field that contains a known pulsar and presented a
time domain analysis. Of the 11 real pulses (transient
signals) present in this short data block, only 6 are detectable
with the sensitivity of the 30-antenna incoherent array and
a 6σ detection threshold (top panel). In order to ensure
the detection of all pulses, it turns out that the detection
threshold needs to be lowered to 3.5σ. As seen from the
figure, this also results in many more false positives.
A 3.5σ threshold scales down to ~1.8σ when the array
is sub-divided into 4 distinct groups. Processing down
to such low thresholds will obviously result in numerous
false positives, as can be expected from signal statistics.
However, as illustrated through this figure, virtually all of
them are excised by the coincidence filtering, resulting in
a very small number of false positives in the end. In fact,
a quick glance at the figure (lowermost panel) reveals that
all but the faintest pulse are detectable, alongside a
relatively smaller number of false positives compared to that
of the full 30-antenna array. This clearly illustrates that
our basic theoretical ideas proposed in § 2.1 do work in 
practice in real data. 

5.2.3. Optimal number of sub-arrays 

Fig. 13 shows the net improvement achievable for dif- 
ferent combinations of sub-arrays, i. e. from 2 to 5, com- 
pared to a single incoherent sum of all 30 antennas as 
reference (top panel), for the data set used in the analy- 
sis above. While there is a progressive improvement from 
2 to 4 sub-arrays, the case for 5 sub-arrays is seen to be 
far less appealing compared to 4 sub-arrays. This inabil- 
ity of 5 sub-arrays to win over 4 sub-arrays may perhaps 
be due to possible departures in the detection sensitivity 
from the theoretically expected √n_ant for incoherent sum,
or because the algorithm becomes less effective due to a
larger number of false positives at such very low (1.5σ)
thresholds. In short, 4 sub-arrays seems to be an optimal 
strategy for transient detection with the GMRT. 

In order to quantify the level of improvements as well 
as to obtain more meaningful statistics, we processed the 
full duration of the scan (300 seconds) and conducted similar
analysis, the summary of which is shown in Table 1.
The improvements are tabulated both in terms of detections
of real pulses as well as the number of false positives.
The data were split into two halves for this analysis, as of- 
ten the number of detections will critically depend on the 
modulation of pulse amplitudes (due to intrinsic and/or 
scintillation effects). The first four columns of the table 
are self-explanatory; column 5 is the fraction of the number
of real events found (normalized to the total number
of real pulses that are present in the data), and
column 6 is the ratio of the number of false positives compared
to that of the full 30-antenna incoherent array. Overall, 
the results are consistent between the two data sets, par- 
ticularly the improvement factor in terms of the number 
of false positives. It is also evident that of all the combi- 
nations, 4 sub-arrays yields the best improvement, which 
supports our finding from the example illustrated through 
Fig. 13. 

6. EVENT ANALYSIS PIPELINE 

Promising candidate events that emerge from our processing
pipeline are subjected to detailed scrutiny, a basic
scheme for which is shown in Fig. 14. Among the salient
features are the integration of a calibration and imaging 
pipeline for potential on-sky localization of the event and 
the ability to phase up the full array toward the target 
position to enable a high-quality signal detection and con- 
firmation. This necessitates keeping track of the calibra- 
tor observations and processing them routinely to solve 
for the complex gains of the array elements. For local- 
ization via imaging, the raw data segments of an event 
are first correlated to generate visibility data. Perform- 
ing phase-coherent dedispersion prior to correlation can 
greatly increase the chances of localization, especially for 
short-duration signals at moderate or high DMs (e.g. gi- 
ant pulses). In the event that a clear detection in imag- 
ing and accurate localization (~5-10") is possible, a sen- 
sitive phased-array beam can be formed toward the tar- 
get position to enable a high-quality signal detection and 
characterisation.^3 Depending on the characteristics of the
signal (e.g. time duration, temporal structure and DM), 
the phased-array data can then be processed for phase- 
coherent or incoherent dispersion removal followed by de- 
tection and subsequent analysis and further checks. As 
well as enabling crucial integrity checks of the detected 
events, such a powerful methodology offers the advantages 
of obtaining additional information - such as high time 
resolution studies, accurate DM estimation and localiza- 
tion of the target for any genuine signals that may need 
further detailed follow-ups. Some of these possibilities are 
further elaborated and illustrated through suitable exam- 
ples in the subsequent sections. 

6.1. Imaging pipeline 

As described above, one of the major advantages of transient
detection via interferometric arrays is the possibility
of localisation of the source. This is most straightforwardly
done by making an image of the transient. As described
elsewhere in this paper, once a particular data stretch has
been identified as containing a possible transient, the voltage
data from each antenna for the corresponding time
interval are saved. These data are then correlated (using
essentially the same correlation routines as used in the real
time system) to create a set of visibilities. The process of
making an image from these visibilities is well understood
(see e.g. Thompson et al. 2001), and there exist several
software packages designed for this purpose (e.g.
AIPS, Miriad, CASA). The principal steps are (i) identifying
and flagging out erroneous visibilities, e.g. those
affected by radio frequency interference, which can be
significant at most of the frequencies at which the GMRT
operates, (ii) correcting for the complex gain (including the
atmospheric/ionospheric gain) and (iii) imaging and
deconvolution. The first two of these steps have been
incorporated into a pipeline FLAGCAL (Prasad & Chengalur
2012), while the imaging and deconvolution is currently
done using one of the standard packages (AIPS in this
instance).

3

As the localisation radius scales as (S/N)^-1 for unresolved point
sources, accuracies at the level of an arc second are achievable even
in the case of marginal (~5-10σ) detections at 325 and 610 MHz.

6.1.1. Identification and flagging of corrupted visibilities 

The most common type of strong RFI at the GMRT
site has a small occupancy in the time-frequency space,
i.e. is either limited in time, or in frequency, or in both.
FLAGCAL uses this fact to identify corrupted visibilities. 
Essentially robust statistics (across time, frequency and 
baselines) of the visibilities are derived, and then outliers 
with respect to these statistics are identified and flagged 
out. Slow variations in the visibilities are accounted for 
by allowing for a (user definable) smoothing in the time 
frequency plane before computation of the statistics and 
identification of the outliers. In calibrator scans, one would 
expect that, in the absence of any corruption, the phase 
of the visibility would be nearly constant on the typical 
timescale of a calibration observation (i.e. of the order of a
few minutes). This is also used to identify corrupted data.
RFI often affects contiguous sets of channels and/or time
ranges, and hence two passes are made through the data,
one of which identifies corrupted visibilities on the basis of 
the robust statistics, and the other that marginalises over 
the flagged data to identify frequency channels, baselines 
and/or antennas for which the data has been corrupted. 
The output of this stage of the pipeline is a set of visibilities 
in which all data identified as being corrupted has been 
flagged out. Since the determination of robust statistics 
is computationally intensive, FLAGCAL implements this 
using OpenMPI, resulting in significant speed ups in multi- 
core machines. 
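
The two-pass idea can be sketched with a generic robust estimator. The example below uses the median and the median absolute deviation (MAD); the 5σ cut, the 50% channel occupancy limit, the plain list-of-lists data layout and the function names are all assumptions made for illustration. FLAGCAL's actual algorithm, including its smoothing step and baseline and antenna marginalisation, is more elaborate.

```python
from statistics import median


def robust_sigma(values):
    """Median and MAD-based estimate of the Gaussian-equivalent standard deviation."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return med, 1.4826 * mad


def flag_outliers(amp, n_sigma=5.0):
    """First pass: flag samples of a (time x channel) amplitude grid that are robust outliers."""
    flat = [v for row in amp for v in row]
    med, sigma = robust_sigma(flat)
    return [[abs(v - med) > n_sigma * sigma for v in row] for row in amp]


def flag_bad_channels(mask, max_frac=0.5):
    """Second pass: flag a whole channel if more than max_frac of its samples are flagged."""
    n_time = len(mask)
    n_chan = len(mask[0])
    bad = [sum(row[c] for row in mask) / n_time > max_frac for c in range(n_chan)]
    return [[m or bad[c] for c, m in enumerate(row)] for row in mask]
```

Median-based statistics are used here because, unlike the mean and standard deviation, they are barely perturbed by the small fraction of strongly corrupted samples that they are meant to find.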

6.1.2. Calibration 

At any instant, an N-element interferometric array
(i.e. one in which there are N unknown complex
instrumental gains) measures N(N − 1)/2 complex visibilities.
This makes the problem of determining the antenna gains
from observations of a calibrator source with known
visibilities (e.g. a point source at the phase centre)
overdetermined. Iterative schemes for determining the least
squares solution for this problem have been described by 
e.g. Bhatnagar (2001). FLAGCAL implements this iter- 
ative scheme to determine the complex antenna gains. 
In general there are several kinds of calibrations that
can be performed, viz. "flux calibration" for determining
the absolute flux level; "bandpass calibration" for
the spectral response; and "phase calibration" for the
combination of the atmospheric and instrumental gains.
FLAGCAL implements all of these calibrations and also
interpolates the final corrections on to the target
visibilities. It also allows interpolation of the flags from the
calibration scans onto the target visibilities. This is useful
in the (commonly encountered) situation where there
is persistent RFI affecting some given spectral channels,
antennas or baseline combinations.

Fig. 16. - Image made from the correlated visibility data for the pointing GTC_002.52-1.64. The total time interval over which the
visibilities are calculated is 0.8 seconds. The pulsar J1752-2806 was detected in this pointing by the transient search pipeline. The image
shows a number of sources in the field, and the source corresponding to J1752-2806 is marked with a box. The image with the large
field-of-view has a resolution of 36" x 20". A zoomed-in image of the pulsar emission with a resolution of 16.5" x 8.5" is also shown. The contour
levels in the zoomed-in image start at 41 mJy and are in steps of 9 mJy. The position and flux measured from this image are RA = 17 52
58.746 ± 0.024, DEC = -28 06 36.09 ± 0.41 and 80 ± 9 mJy.
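
The overdetermined gain problem of § 6.1.2 can be illustrated with a damped fixed-point iteration that follows from minimising the sum of |X_ij − g_i g_j*|^2 over baselines, where X_ij is the ratio of observed to model visibility. This is a sketch of a commonly used scheme of this type; whether it matches the FLAGCAL implementation in detail is an assumption, and the damping factor, iteration count and function names are illustrative choices.

```python
def n_baselines(n_ant):
    """Number of independent baselines for an array of n_ant antennas."""
    return n_ant * (n_ant - 1) // 2


def solve_gains(x, n_iter=200, alpha=0.5):
    """Estimate complex antenna gains from visibility ratios x[i][j] ~ g_i * conj(g_j).

    x must be Hermitian (x[j][i] == conj(x[i][j])); diagonal terms are ignored.
    """
    n = len(x)
    g = [1.0 + 0.0j] * n  # start from unit gains
    for _ in range(n_iter):
        new = []
        for i in range(n):
            num = sum(x[i][j] * g[j] for j in range(n) if j != i)
            den = sum(abs(g[j]) ** 2 for j in range(n) if j != i)
            # damped update towards the least-squares condition for antenna i
            new.append(g[i] + alpha * (num / den - g[i]))
        g = new
    # the overall phase is arbitrary: reference all phases to antenna 0
    ref = g[0] / abs(g[0])
    return [gi * ref.conjugate() for gi in g]
```

For the 30 GMRT antennas, `n_baselines(30)` gives 435 measured visibilities for 30 unknown complex gains, which is why the solution is well constrained when most baselines are unflagged.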

6.1.3. Imaging and Deconvolution 

The flagged and calibrated visibilities computed by 
FLAGCAL are written out as a FITS file. This allows easy
processing using standard imaging packages. This stage of 
the processing can easily be automated. The GMRT ar- 
ray configuration has been designed to give a fairly good 
snapshot UV coverage at most declinations (see Swarup et 
al. (1991)) allowing for a good localisation of the source. 
As described above, this localisation is also important in 
determining that the transient emission that was detected 
indeed arises from the sky. In situations where there is 
sufficient signal-to-noise ratio, further confirmation of this 
comes from confirming that self calibration improves the 
signal-to-noise ratio of the image. 

6.1.4. Case study: PSR J1752-2806

The pulsar J1752-2806 was targeted as part of the
test observations on 2010 February 20. Calibration
was done using the data from the source 1830-360
from the VLA catalog. The data were run through the
FLAGCAL pipeline, and then imaged using the AIPS task
IMAGR. The image was produced by excluding the edge
channels, as well as some central channels which were
badly affected by RFI and for which most of the data
was flagged out. The final bandwidth used to make the
image was ~14.7 MHz (i.e. ≈90% of the recording
bandwidth), and the noise level in the image is ~12 mJy. The
pulsar is clearly detected at a flux level of ~220 mJy.
A strong confirmation that the emission arises from the
sky (and is not some chance RFI) comes from the fact
that self calibration significantly increased the peak flux
of the source; specifically, the flux after one round of
phase-only self calibration is ~280 mJy. A similar confirmation
comes from the fact that cleaning (which was done using 
the AIPS task IMAGR) substantially reduces the sidelobe 
levels. Fig. 15 shows the images produced before and after 
cleaning. 

6.2. Application for event localization 

We present another case study where a transient pulse 
was blindly detected in our search pipeline. The scenario 
is the same as that outlined in § 5.2.1, i.e. a survey field 
that contained a known pulsar but at a large offset of ≈71'
from the phase centre. This large offset (1.7× the nominal
half power beam width) means a source location near the
edge of the beam and as a result the pulsar will effectively
be detected as an intermittently emitting transient source.
The pulse was detected as a 5σ event in the search pipeline
and the signal characteristics (i.e. arrival time and DM) 
were then used to determine and extract the corresponding 
raw data segments from all 30 antennas. These data were
then correlated to produce the visibilities, which were
subsequently imaged using the procedures described in § 6.1.
Observations of 1830-360 recorded 36 minutes
prior to the detection time of the transient pulse were used
for calibrating the visibilities.

Fig. 17. - Example data illustrating the detection of a candidate signal and its subsequent confirmation and verification. The transient
pulse (from a known pulsar that was within the FoV) was first detected at a relatively low significance (≈5σ at ≈0.25 ms resolution) and
was treated as a candidate as it passed the set coincidence criteria. A sensitive phased array was then formed by phasing up all antennas
with good calibration solutions toward the source position as determined by the on-sky imaging pipeline (see Fig. 16). The improved signal
detection from this phased-array data is shown in the bottom panels. The peak signal-to-noise ratio improved by a factor of six, which is
approximately 70% of the expected level of improvement given only 21 of the 30 antennas were phased up for final detection and verification.

As a demonstration of our event localisation strategy, a 
snap-shot image was made of a 3° x 3° region (nominal full 
beam width ~1.4°) of the sky centred at the phase centre 
of the survey pointing (RA = 17h 57m 51.48s, DEC
= -27d 36' 00.0"). The pulsar was clearly detected in the
image along with several other point sources in the field
(see Fig. 16). The estimated pulsar position of RA = 17 52
58.746 ± 0.024, DEC = -28 06 36.09 ± 0.41 is within 2-3σ
of the catalog position^4 and the measured flux of ~80 ± 9 mJy
roughly agrees with the expected flux after scaling for the
primary beam. 

It is worth noting that even at this relatively bright flux 
level, there are a number of sources within the FoV. A cross 
check of the source positions in the image with the NVSS 



4

The actual positional uncertainties will be of the order of one third
of the beam size, i.e. approximately 0.3 s in RA and 3" in DEC.



catalogue shows a good correspondence. The large number 
of "confusing" sources may partly be because the target 
field is close to the Galactic centre (l = 1.5°, b = -1°).
At fainter flux levels however, one would expect that there 
would be a number of background sources that would be 
present in the FoV. To distinguish between these and the 
transient source, we may apply either of the following two 
strategies: (i) make a fresh image centred on the time 
range during which the transient was the brightest - pre- 
sumably the only source in this image whose flux will vary 
will be the transient; or, (ii) redo the transient search with 
a phased-array beam centred on each of the candidate 
sources - the signal-to-noise ratio would be the largest 
when the antennas are phased up toward the right posi- 
tion. 

6.3. Phased array for improved signal detection and 
confirmation 

In addition to producing incoherent array beams, the 
GSB beamformer can also generate coherent (phased) ar- 
ray beams. This involves performing suitable addition 
of pre-detected voltage samples from individual antennas.
Coherent beam formation however requires calibrating out
the antenna based phase offsets before the voltage samples
can be added. These antenna based phases are solved using
the recorded cross-correlations on a calibrator source
near the target position, typically observed alongside the
observations. As outlined in Roy et al. (2010), these
phases are applied after the FFT stage, as an additional 
term in the fringe corrections. 

As discussed in § 2.1, the incoherent array beam has
the same field-of-view as the primary beam of a single
antenna, but with an enhanced sensitivity of √N_a times that
of a single antenna, for an array of N_a antennas. However,
the coherent array beam is much narrower than that of a
single antenna - similar to the synthesized beam obtained
from the array of N_a antennas, and therefore results in a
sensitivity improvement of N_a times that of a single
antenna. Hence by forming the coherent array beam
towards the target source after phasing up the array, we
expect a √N_a sensitivity improvement compared to the
incoherent array.
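
The scaling argument above can be written out as a short sketch. It assumes identical antennas and perfect phasing; the function names are hypothetical, and the example only evaluates the idealised √N_a and N_a laws quoted in the text.

```python
from math import sqrt


def incoherent_gain(n_ant):
    """Sensitivity of an incoherent sum of n_ant antennas, relative to one antenna."""
    return sqrt(n_ant)


def coherent_gain(n_ant):
    """Sensitivity of a perfectly phased array of n_ant antennas, relative to one antenna."""
    return float(n_ant)


def phased_over_incoherent(n_phased, n_incoherent):
    """Expected S/N ratio of a phased-array detection to an incoherent-array detection."""
    return coherent_gain(n_phased) / incoherent_gain(n_incoherent)


# Full 30-antenna array in both modes: the ideal improvement is sqrt(30)
print(f"ideal improvement for 30 antennas: {phased_over_incoherent(30, 30):.2f}")
```

If only a subset of antennas has usable phase solutions, `n_phased` is smaller than the number used in the original detection and the achievable improvement drops in proportion.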

As a demonstration of the follow-up strategy outlined 
above, we formed a phased-array beam at the position of 
the transient pulse in Fig. 16. The pulse was detected at a 
significance of 5σ at a time resolution of ≈0.25 ms.^5 The
initial detection (at this resolution) and the final detec- 
tion from processing the phased-array data are shown in 
Fig. 17 (top and bottom panels respectively). Data on the 
same calibrator source 1830-360 (i.e. recorded ~30 min 
prior to the detection of the pulse) were used to solve for 
the antenna based phases required for phasing up the array.
As seen from the figure, there is a factor of 6 improvement
in the signal-to-noise ratio compared to the initial
detection from the search pipeline. This is almost 70% of the
theoretically expected improvement.^6 A similar analysis
was conducted on multiple other pulse detections and
it suggests that up to ~80% of the theoretically expected 
improvement may be achievable in practice. Even so, sig- 
nificant S/N improvements (as much as a factor 10) are 
still achievable in the final detections. 

The discrepancy may be attributed to plausible calibration
inaccuracies or some possible dephasing of arm antennas
(due to ionospheric effects) given the 12.42° separation
between the pulsar and calibrator positions. In-beam
calibration may help alleviate this in principle; however, the
GMRT's FoV may often limit its prospects; e.g. while
there are multiple point sources in Fig. 16, the brightest
source has a flux of only ~100 mJy, not good enough to
derive reliable calibration solutions. However, this will 
no longer be a limitation for future wide-FoV instruments 
such as MWA, LOFAR and ASKAP that will contain mul- 
tiple potential calibrator sources in any given field. 

⁵ Even though a higher significance is possible via matched
filtering, we limit the time resolution to 0.25 ms for the
purpose of this analysis.

⁶ Only 19 of the 30 antennas were phased up for the final
detection. Antennas with poor phase solutions were flagged
from the analysis to maximise the signal detection.

7. FUTURE WORK

While the analysis presented in this paper is largely
based on our pilot survey data at 325 MHz, it is important
to ascertain the efficacies and limitations of conducting
transient searches at different frequencies of the GMRT.
For instance, the RFI environment varies significantly
between different frequencies and it will be very useful to
investigate the effectiveness of snap-shot imaging for
developing immunity against a variety of RFI-generated
events. This will be the subject of a future publication.
While the lower frequencies provide the basic advantage of
comparatively larger fields of view (e.g. ~6 deg² at
150 MHz compared to ~1.5 deg² at 325 MHz), they also imply
increased challenges in terms of having to deal with more
severe RFI environments. The higher frequencies (e.g.
1400 MHz), on the other hand, while helping to extend the
parameter space (i.e. searching out to larger DMs), may
necessitate trading off the achievable field of view
(~0.1 deg²). We will conduct similar pilot surveys at all
other frequencies of the GMRT to further optimise our
processing pipelines and search algorithms.
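
The field-of-view figures quoted above can be roughly reproduced with a simple scaling. The sketch below assumes that the primary beam width is proportional to wavelength, so that the field of view scales as ν⁻², and it uses the ~1.5 deg² value at 325 MHz as the reference. The function name and the scaling law are assumptions made for illustration; the actual beam of each GMRT feed will differ somewhat.

```python
def fov_deg2(freq_mhz: float,
             ref_freq_mhz: float = 325.0,
             ref_fov_deg2: float = 1.5) -> float:
    """Scale a reference field of view as nu^-2 (beam width ~ wavelength)."""
    return ref_fov_deg2 * (ref_freq_mhz / freq_mhz) ** 2


if __name__ == "__main__":
    for freq in (150.0, 325.0, 610.0, 1400.0):
        print(f"{freq:6.0f} MHz: ~{fov_deg2(freq):.2f} deg^2")
```

This idealised scaling gives about 7 deg² at 150 MHz and about 0.08 deg² at 1400 MHz, in rough agreement with the values quoted in the text.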

Our eventual goal is a transient detection system for the
GMRT that functions in a commensal mode with other
observing programs. Having demonstrated the efficacy
of multiple (incoherent) sub-arrays for initial detection
and of interferometric capabilities for on-sky localization,
the next logical step is a real-time implementation of such
a pipeline. The GPU-optimized dedispersion software
developed at Swinburne (Barsdell et al. 2012) has been
benchmarked for the GMRT frequencies and the current
recording bandwidth of 32 MHz, and can handle up to ~550
trial DMs (at both 325 and 610 MHz). The matched-filtering
based detection is relatively inexpensive computationally
and can easily be integrated. However, recovering the loss
in sensitivity (due to sub-arraying) through the use of
very low detection thresholds (i.e. ~2-3σ compared to the
~5-6σ typically used in transient searches) may require
some optimisation of the downstream algorithms for event
scrutiny.
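
To illustrate why such low thresholds place a burden on the downstream event scrutiny, the following sketch estimates the number of noise-only threshold crossings per second of data in a single beam. It assumes pure Gaussian noise, a one-sided threshold, statistically independent samples in every DM trial, a sampling time of 0.25 ms and 550 trial DMs. Neighbouring DM trials are in practice correlated, so the numbers are indicative only, and the function names are hypothetical.

```python
import math


def tail_probability(threshold_sigma: float) -> float:
    """One-sided Gaussian probability of exceeding threshold_sigma."""
    return 0.5 * math.erfc(threshold_sigma / math.sqrt(2.0))


def noise_events_per_second(threshold_sigma: float,
                            sample_time_s: float = 0.25e-3,
                            n_dm_trials: int = 550) -> float:
    """Expected noise-only threshold crossings per second of data,
    assuming independent Gaussian samples in every DM trial."""
    samples_per_second = n_dm_trials / sample_time_s
    return samples_per_second * tail_probability(threshold_sigma)


if __name__ == "__main__":
    for thr in (2.0, 3.0, 5.0, 6.0):
        print(f"{thr:.0f} sigma: {noise_events_per_second(thr):.3g} events/s")
```

Under these assumptions a 2σ threshold yields tens of thousands of noise events per second, against less than one per second at 5σ, which is why coincidence filtering across sub-arrays is needed before candidates can be inspected.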

We have outlined and demonstrated a specific approach 
for transient detection with interferometric arrays. While 
we advocate the use of multiple incoherent sub-arrays and 
coincidence checks as a promising strategy, there may be 
other possibilities that are worth exploring within the 
general context of next-generation instruments such as 
ASKAP, MeerKAT and the SKA, including, for exam- 
ple, the use of multiple sub-arrays to achieve larger fields- 
of-view (i.e. by pointing the sub-arrays in different re- 
gions of the sky). The improved sensitivity achievable 
through the use of wider-bandwidth recorders, and the 
constraints that may arise in terms of data rates and pro- 
cessing needs, are also among important aspects that need 
investigation. As the GMRT is upgraded in the coming
years through the commissioning of its broad-band receivers
and backends, new avenues will open up for undertaking
more promising, albeit more complex, science demonstrator
projects relevant to the SKA era.

8. SUMMARY AND CONCLUSIONS 

While large single-dish instruments currently dominate
time-domain science applications such as pulsars and fast
transients, the future lies in the effective use of
large-element interferometric arrays. The GMRT, with its
modest number of elements and long baselines, makes a
powerful platform for developing the necessary techniques
and methodologies. In particular, its sub-array and
interferometric capabilities can be well exploited for
efficient detection of fast transients as well as for their
accurate on-sky localisation.

Among the various considerations in the use of arrays
for transient exploration is the trade-off between the field
of view and absolute detection sensitivity. We have
postulated the basic idea of generating a relatively small
number of incoherently summed sub-arrays from the full array
and then combining the results of detections of candidate
transient events from each of these sub-arrays, so as to
optimise the rejection of false positives due to receiver
noise and RFI, using suitably devised coincidence filtering
techniques. As we demonstrate through multiple examples
and analysis, this enables reaching the sensitivity of
the full phased array, while preserving the full FoV of the
single antenna element. This approach is promising as it
offers a dramatic improvement in terms of the prospects
of detecting weaker signals. For example, a ~2σ detection
from initial processing will eventually be a ~10σ signal
after phasing up the full array, and hence will be both
unambiguously verifiable in time series and localizable
(on sky) to arcsecond accuracy.

The GMRT software backend allows raw voltage data 
from individual array elements to be recorded and made 
available for software-based pipelines. We have exploited 
this optional feature to develop and implement a transient 
detection pipeline for the GMRT. This includes a beam- 
former that operates on 2 x 30 raw voltage data streams 
to produce multiple incoherently summed sub-arrays, the 
data from which are then dedispersed and searched for 
transient events. The resultant events are scrutinised by
the coincidence algorithms that take into account likely 
differences in the detection sensitivity between different 
sub-arrays that may result from either local RFI, or from 
one or more array elements performing at less than their 
nominal sensitivities. We have also explored the effective- 
ness of the algorithms as a function of the detection thresh- 
old as well as sub-arraying, and our analyses suggest that 
four sub-arrays make an optimal choice for transient de- 
tection with the GMRT. 
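
The coincidence step can be illustrated with a toy example. This is a minimal sketch and not the pipeline's actual algorithm: the event layout (arrival time, DM), the tolerances and the `min_subarrays` option are all hypothetical, with the last one standing in crudely for the allowance made for sub-arrays of reduced sensitivity.

```python
from typing import List, Optional, Sequence, Tuple

Event = Tuple[float, float]  # (time in s, DM in pc cm^-3); hypothetical layout


def _matches(a: Event, b: Event, time_tol_s: float, dm_tol: float) -> bool:
    """True if two events agree in time and DM within the given tolerances."""
    return abs(a[0] - b[0]) <= time_tol_s and abs(a[1] - b[1]) <= dm_tol


def coincidence_filter(subarray_events: Sequence[Sequence[Event]],
                       time_tol_s: float = 1e-3,
                       dm_tol: float = 1.0,
                       min_subarrays: Optional[int] = None) -> List[Event]:
    """Keep events that are seen in at least `min_subarrays` sub-arrays
    (default: all of them)."""
    required = len(subarray_events) if min_subarrays is None else min_subarrays
    accepted: List[Event] = []
    for events in subarray_events:
        for cand in events:
            # Number of sub-arrays holding an event that matches this candidate.
            n_hits = sum(
                any(_matches(cand, other, time_tol_s, dm_tol) for other in sub)
                for sub in subarray_events
            )
            duplicate = any(_matches(cand, kept, time_tol_s, dm_tol)
                            for kept in accepted)
            if n_hits >= required and not duplicate:
                accepted.append(cand)
    return accepted


if __name__ == "__main__":
    subs = [
        [(1.0000, 50.0), (2.5000, 10.0)],   # sub-array 1
        [(1.0004, 50.2), (3.7000, 80.0)],   # sub-array 2 (one local RFI event)
        [(0.9998, 49.9)],                   # sub-array 3
        [(1.0001, 50.1), (2.5002, 10.1)],   # sub-array 4
    ]
    print(coincidence_filter(subs))                   # strict: all sub-arrays
    print(coincidence_filter(subs, min_subarrays=2))  # relaxed criterion
```

With the strict criterion only the event near 1.0 s survives. Relaxing the requirement to two sub-arrays also admits the event near 2.5 s, while the isolated event in sub-array 2 is rejected in both cases.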

Important future work includes undertaking pilot sur- 
veys at different frequencies of the GMRT in order to fur- 
ther optimise the strategies and algorithms for transient 
detection with arrays, and the development of a real-time 
version of the transient pipeline that can work in commen- 
sal mode. 

The work described here demonstrates the applications of
interferometric arrays for fast transient exploration, an
important preparatory step for planned science with SKA
pathfinder instruments such as ASKAP and MeerKAT. It also
forms the first step in adding new capabilities to the
GMRT in the exciting arena of charting the transient sky at
radio wavelengths.

9. APPENDIX: PROBABILITY OF FALSE ALARMS FOR
SUB-ARRAY CONFIGURATIONS

For the case where the random noise output signal of
a radio telescope antenna follows a Gaussian distribution
with zero mean and variance σ², the probability of the
signal level crossing a threshold T is given by:

$$P(>T) = \int_{T}^{\infty} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{y^2}{2\sigma^2}\right] dy = \frac{1}{2}\,\mathrm{Erfc}\left(\frac{T}{\sqrt{2}\,\sigma}\right) \qquad (4)$$

where Erfc is the complementary error function. We
call P(>T) the probability of false alarms (PFA), as these
excursions would lead to false triggering of a transient
detection pipeline where the detection threshold is set to
T.

Fig. 18. The change in the computed PFA with threshold τ (in
units of σ) for the case of one IA beam with N antennas in the
sub-array (solid line) and for the case of N IA beams with
one antenna in each sub-array (dashed line). The symbols mark the
results from numerical simulations using Gaussian random noise.
(Horizontal axis: detection threshold in units of σ, from 0.01 to 10.)

For an array of N such antennas, the signal can be
combined in different ways. For a coherent phased array,
the voltage signals from individual antennas are added in
phase and then squared to get the total intensity, and then
further integrated in time and frequency as required to get
the final output. For an incoherent array, the intensity
signals from individual antennas are added to get the total
intensity signal and then integrated to the desired time
and frequency resolution. The effective σ decreases by a
factor of N or √N, for the coherent and incoherent array
outputs, respectively (Gupta et al. 2000). Thus, for an
incoherent array (IA) of N antennas, the effective sigma
is:

$$\sigma_{\rm eff} = \frac{\sigma}{\sqrt{N}} \qquad (5)$$

where σ² is the variance for a single antenna.
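
The two ways of combining the antenna signals can be compared with a small Monte Carlo experiment. The sketch below is purely illustrative and is not the GMRT beamformer: it assumes a signal of fixed amplitude that is already in phase at every antenna, unit-variance complex Gaussian noise that is independent between antennas, and hypothetical function and parameter names.

```python
import math
import random
import statistics
from typing import Tuple


def simulate_snr(n_ant: int, amp: float,
                 n_trials: int = 4000, seed: int = 1) -> Tuple[float, float]:
    """Return (coherent_snr, incoherent_snr) for a signal of amplitude `amp`
    present in every antenna, with unit-variance complex Gaussian noise."""
    rng = random.Random(seed)
    comp_std = math.sqrt(0.5)  # per-component noise std, so E|noise|^2 = 1

    def beams(signal: float) -> Tuple[float, float]:
        volts = [complex(signal + rng.gauss(0.0, comp_std),
                         rng.gauss(0.0, comp_std))
                 for _ in range(n_ant)]
        coherent = abs(sum(volts)) ** 2               # add voltages, then square
        incoherent = sum(abs(v) ** 2 for v in volts)  # square, then add
        return coherent, incoherent

    off = [beams(0.0) for _ in range(n_trials)]  # noise only
    on = [beams(amp) for _ in range(n_trials)]   # signal plus noise

    def snr(idx: int) -> float:
        off_vals = [x[idx] for x in off]
        on_vals = [x[idx] for x in on]
        excess = statistics.mean(on_vals) - statistics.mean(off_vals)
        return excess / statistics.stdev(off_vals)

    return snr(0), snr(1)


if __name__ == "__main__":
    coh, inc = simulate_snr(n_ant=30, amp=0.5)
    print(f"coherent S/N   ~ {coh:.2f}")
    print(f"incoherent S/N ~ {inc:.2f}")
    print(f"ratio          ~ {coh / inc:.2f} (sqrt(30) = {math.sqrt(30):.2f})")
```

In this idealised model the single-antenna S/N is amp², so the coherent and incoherent outputs are expected to improve on it by factors of about N and √N respectively.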

In what follows we consider the following three cases of
incoherent array:

(A) Single IA beam from a single sub-array of N
antennas:

For this case the probability of false alarms can be
expressed as (from Eq. 4 above):

$$P_a(>T) = \frac{1}{2}\,\mathrm{Erfc}\left(\frac{\tau\sqrt{N}}{\sqrt{2}}\right) \qquad (6)$$

where τ = T/σ is the detection threshold in units of σ.

(B) N IA beams from N sub-arrays, each with
one antenna:

In this case the false alarms due to noise statistics are
independent events in each sub-array output, and if we
use a coincidence filtering scheme where a false alarm is
declared only if it is present simultaneously in all
sub-array outputs, then the PFAs from all the IA beams get
multiplied to give

$$P_b(>T) = \left[\frac{1}{2}\,\mathrm{Erfc}\left(\frac{\tau}{\sqrt{2}}\right)\right]^{N} \qquad (7)$$

(C) p IA beams from p sub-arrays, each with n =
N/p antennas:

The most general case is to have p = N/n sub-arrays,
each having n antennas, for which the effective PFA
is given by

$$P_c(>T) = \left[\frac{1}{2}\,\mathrm{Erfc}\left(\frac{\tau\sqrt{n}}{\sqrt{2}}\right)\right]^{p} \qquad (8)$$

Note that for p = 1, n = N, P_c = P_a, and for p = N,
n = 1, P_c = P_b. Also, it is easy to show that, for a given
threshold τ = T/σ, P_b < P_c < P_a. This is illustrated
in Fig. 18, which plots the PFA as a function of τ for the
two cases P_b and P_a, for the GMRT value of N = 30.
Results from simulation runs using random Gaussian noise
(denoted by the symbols) are also overplotted on the
theoretically calculated curves. As can be seen, for small
threshold values, log P_b is lower than log P_a by about
10, and this difference increases with τ for values of
τ > 1, reaching a value of around 26 for a threshold of 3.0.
The curve for P_c would lie in between the curves for P_a
and P_b.

From this, it is evident that there exists a possibility
for trading off between PFA and τ, for different choices of
sub-arrays. For example, in order to have the same PFA
for different combinations of sub-arrays, it is possible to
work at lower thresholds for cases where the array is split
into sub-arrays. This can offset the basic reduction in
sensitivity that each sub-array suffers from (compared to
the case of a single sub-array of N antennas), while offering
improved immunity against local interference signals that
do not pass the coincidence filtering test.

As an alternate illustration of the ideas, Fig. A2 shows the
ratio of P_c to P_b (on a log scale) as a function of the
number of antennas in the sub-array, n, for different choices
of the total number of antennas, N, and for a fixed choice of
τ = 3.0. Moving along any of these curves from n = N to n = 1
illustrates how the PFA reduces as more sub-arrays are used.

Thus, it is possible to reduce the probability of false
alarms by using multiple sub-arrays, and this can be used
to trade off sensitivity (via different threshold values)
against false alarm rate to optimise the performance of the
entire array.
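
The dependence of the PFA on the sub-array configuration can be evaluated numerically. The sketch below assumes the general form P_c = [½ Erfc(τ√n/√2)]^p for p = N/n sub-arrays, with τ in units of the single-antenna σ. This form is chosen because it reduces to case (A) for n = N and to case (B) for n = 1. The function name is hypothetical, and the calculation works in log10 because the probabilities are extremely small.

```python
import math


def log10_pfa(tau: float, n_total: int, n_per_sub: int) -> float:
    """log10 of the coincidence PFA for p = n_total / n_per_sub sub-arrays,
    with tau the threshold in units of the single-antenna sigma."""
    if n_total % n_per_sub != 0:
        raise ValueError("n_per_sub must divide n_total")
    p = n_total // n_per_sub
    single_beam = 0.5 * math.erfc(tau * math.sqrt(n_per_sub / 2.0))
    if single_beam <= 0.0:
        # math.erfc underflows to zero for very large arguments.
        raise OverflowError("PFA too small to represent in double precision")
    return p * math.log10(single_beam)


if __name__ == "__main__":
    n_total, tau = 30, 3.0
    for n in (30, 15, 10, 6, 5, 3, 2, 1):
        print(f"n = {n:2d}, p = {n_total // n:2d}: "
              f"log10(PFA) = {log10_pfa(tau, n_total, n):7.2f}")
    diff = log10_pfa(tau, n_total, n_total) - log10_pfa(tau, n_total, 1)
    print(f"log10(P_a) - log10(P_b) at tau = {tau}: {diff:.1f}")
```

For N = 30 and τ = 3.0 the difference between log10 P_a and log10 P_b evaluates to about 26, in line with the value quoted for Fig. 18, and the intermediate configurations fall between the two limiting cases.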



Table 1

Coincidence filtering: event list summary

N_sub   Threshold (σ)   N_real   N_false   f_real   f_false
 (1)         (2)          (3)      (4)       (5)      (6)
------------------------------------------------------------
Data segment 1 (0 - 150 seconds)
  1          3.5           62      1193      0.25     1
  2          2.5          113       615      0.45     0.52
  3          2.0          121       653      0.49     0.55
  4          1.7          112       367      0.45     0.31
  5          1.5          108       537      0.43     0.45
Data segment 2 (150 - 300 seconds)
  1          3.5           92      1243      0.39     1
  2          2.5          146       601      0.62     0.48
  3          2.0          119       761      0.51     0.61
  4          1.7          133       393      0.57     0.32
  5          1.5          131       564      0.56     0.45
------------------------------------------------------------

Note. Results from a coincidence filtering analysis
of the survey field GTC_002.52-1.64, where a known
pulsar (PSR J1752-2806) was present at an offset of
≈1.2° from the beam phase centre (i.e. outside the
half-power primary beam; see Fig. 16). The number of
sub-arrays (N_sub) was varied from one to five, and the
detection thresholds are scaled down assuming the
theoretically expected √n dependence, where n is the number
of antennas per sub-array; the numbers of real events and
false positives (N_real and N_false) are tabulated along with
the corresponding fractions (f_real and f_false).
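
The threshold scaling used in the table can be checked with a few lines. The sketch assumes that the N_sub = 1 threshold of 3.5 is the reference value and that the array is split into equal sub-arrays of n = N/N_sub antennas, so that a √n scaling is equivalent to dividing by √N_sub. The function name is illustrative.

```python
import math


def subarray_threshold(full_array_threshold: float, n_subarrays: int) -> float:
    """Scale a full-array threshold by sqrt(n / N) = 1 / sqrt(N_sub),
    where n = N / N_sub antennas go into each sub-array."""
    return full_array_threshold / math.sqrt(n_subarrays)


if __name__ == "__main__":
    for n_sub in range(1, 6):
        print(n_sub, f"{subarray_threshold(3.5, n_sub):.2f}")
```

The scaled values (3.50, 2.47, 2.02, 1.75 and 1.57) agree with the tabulated thresholds to within 0.1.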



